
Remove Html Formatting From Pandas Cell

I have this DataFrame on pandas: import pandas as pd df = pd.DataFrame({'CARGO': {53944: 'Driver', 57389: 'Driver', 60851: 'Driver', 64322: 'Driver', 67771: 'Driver'}, 'DATE

Solution 1:

You can replace the newline character with the HTML line-break tag <br>, render the frame explicitly with .to_html() and HTML, and set display.max_colwidth to None (older pandas versions used -1) so that long cells are not truncated when converting to HTML. The snippet also escapes $ so the notebook does not treat it as the start of a MathJax formula:

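from IPython.display import HTML

# None disables truncation of long cells (older pandas versions used -1)
pd.set_option('display.max_colwidth', None)
# Escape $ so MathJax ignores it, then turn newlines into HTML line breaks
df['DESCRICAO'] = df['DESCRICAO'].str.replace('$', '\\$', regex=False).str.replace('\n', '<br>', regex=False)
HTML(df.to_html(escape=False))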
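
To check the effect outside a notebook, here is a minimal sketch on a made-up one-row frame. The sample text is hypothetical; only the column name DESCRICAO comes from the question. It compares what to_html emits with and without escaping:

```python
import pandas as pd

# Hypothetical sample data; only the column name is taken from the question.
sample = pd.DataFrame({"DESCRICAO": ["Salary $100\nBonus $20"]})

sample["DESCRICAO"] = (
    sample["DESCRICAO"]
    .str.replace("$", "\\$", regex=False)    # literal replacement, no regex involved
    .str.replace("\n", "<br>", regex=False)  # newline becomes an HTML line break
)

raw_html = sample.to_html(escape=False)  # <br> is kept as a real tag
escaped_html = sample.to_html()          # default escape=True turns < and > into entities
```

With escape=False the cell contains a real <br> tag, while the default escaping emits it as &lt;br&gt;, which the browser shows as literal text.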


Solution 2:

The first part of the problem was solved.

In Markdown, $ marks the start of a MathJax formula. The solution is to insert a backslash before the symbol. Here is the snippet for pandas:

import re

def fix_dollar_sign(x):
    return re.sub(r'\$', r'\\$', x)  # $ is also a regex metacharacter, so the pattern escapes it

df['DESCRICAO'] = df['DESCRICAO'].apply(fix_dollar_sign)
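
As a quick illustration of why the pattern has to escape the dollar sign, here is a small standalone sketch (the sample string is made up). An unescaped $ in a regex is the end-of-string anchor, so it never touches the literal character:

```python
import re

text = "Bonus $20"  # hypothetical cell content

# Escaped pattern: matches the literal dollar sign and prefixes it with a backslash.
escaped_regex = re.sub(r"\$", r"\\$", text)

# Plain string replacement gives the same result without any regex.
escaped_plain = text.replace("$", "\\$")

# Unescaped pattern: $ is an anchor, so the replacement lands at the end of the string.
anchor_only = re.sub(r"$", "!", text)
```

Here escaped_regex and escaped_plain are both `Bonus \$20`, while anchor_only is `Bonus $20!`. If no other regex features are needed, str.replace is probably the simpler choice.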

I wasn't able to create a new line inside the cell, though.

Solution 3:

Expanding on Psidom's excellent answer, you can encapsulate it in a re-usable function. This way you won't alter your dataframe permanently either:

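from IPython.display import HTML

def convert_newlines(s):
    # Leave non-string cells (numbers, NaN) untouched
    return s.replace('\n', '<br>') if isinstance(s, str) else s

def show_dataframe(df):
    # applymap returns a new frame; pandas 2.1+ names this method DataFrame.map
    return HTML(df.applymap(convert_newlines).to_html(escape=False))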
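
To see that the original frame really stays unchanged, here is a rough sketch that returns the HTML string instead of an IPython object, so it also runs outside a notebook. The helper name to_display_html, the sample rows, and the VALOR column are made up for illustration. It applies Series.map column by column, which should behave the same across pandas versions:

```python
import pandas as pd

def convert_newlines(value):
    # Only strings are rewritten; numbers and missing values pass through.
    return value.replace("\n", "<br>") if isinstance(value, str) else value

def to_display_html(df):
    # Hypothetical helper: build a converted copy and render it without escaping.
    converted = df.apply(lambda column: column.map(convert_newlines))
    return converted.to_html(escape=False)

# Made-up sample data; VALOR is a hypothetical numeric column.
sample = pd.DataFrame(
    {"CARGO": ["Driver"], "DESCRICAO": ["line 1\nline 2"], "VALOR": [10.5]}
)
html_out = to_display_html(sample)
```

After the call, html_out contains `line 1<br>line 2`, while sample still holds the original text with the newline.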

Solution 4:

This opens up some interesting possibilities, like highlighting some text in the DataFrame HTML. Here is my attempt:

import re
from IPython.display import HTML, display

def highlight_text_on_descricao(df_rubrica, texto='', cor='red'):
    def marca_texto(x, text, color):
        x, text, color = str(x).upper(), str(text).upper(), str(color).lower()
        # Start offsets of every occurrence of the search text (matched literally)
        marcador_primario = [m.start() for m in re.finditer(re.escape(text), x)]
        if not text or not marcador_primario:
            return re.sub(r'\$', r'\\$', re.sub('\n', '<br>', x))
        contexto = ''
        for item in marcador_primario:
            # Find the boundaries of the line that contains this occurrence
            marcador_inicio = x[:item].rfind('\n')
            if marcador_inicio == -1:
                marcador_inicio = 0
            marcador_final = x.find('\n', item + 1)
            if marcador_final == -1:
                marcador_final = len(x)
            contexto += ("<font color='" + color + "'><b> "
                         + x[marcador_inicio:marcador_final] + '</b></font>')
        marcador_do_primeiro_vermelho = x[:marcador_primario[0]].rfind('\n')
        if marcador_do_primeiro_vermelho == -1:
            descricao = contexto + x[marcador_final:]
        else:
            descricao = x[:marcador_do_primeiro_vermelho] + contexto + x[marcador_final:]
        return re.sub(r'\$', r'\\$', re.sub('\n', '<br>', descricao))

    # Work on a renamed copy so the caller's DataFrame is left untouched
    df_temp = df_rubrica.rename(columns={'DESCRICAO': 'DESCRICAO_LONG_TEXT_STRING____'})
    df_temp['DESCRICAO_LONG_TEXT_STRING____'] = df_temp['DESCRICAO_LONG_TEXT_STRING____'].apply(marca_texto, args=(texto, cor))
    display(HTML(df_temp.to_html(escape=False)))

# tab is the DataFrame being displayed
highlight_text_on_descricao(tab, 'GRATIFICAÇÃO')

This yields a table in which every line containing the search text is rendered in bold red.

(By the way, I added some styles to custom.css from Henry Hammond's PrettyPandas (https://github.com/HHammond/PrettyPandas), which is why the headers and indexes are grey.)
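
For comparison, the same idea can be sketched line by line on a single string, without tracking offsets. This is not the function above, just a simplified variant under my own assumptions: the name highlight_lines and the sample cell are made up, matching is case-insensitive, and the original casing is kept instead of upper-casing the whole text:

```python
def highlight_lines(text, needle, color="red"):
    """Wrap each line that contains `needle` (case-insensitive) in a bold colored tag."""
    rendered = []
    for line in str(text).split("\n"):
        if needle and needle.upper() in line.upper():
            line = f"<font color='{color}'><b>{line}</b></font>"
        # Escape $ so a notebook does not treat it as a MathJax delimiter.
        rendered.append(line.replace("$", "\\$"))
    return "<br>".join(rendered)

# Hypothetical cell content with three lines.
cell = "SALARIO $100\nGRATIFICAÇÃO $20\nOUTROS"
result = highlight_lines(cell, "gratificação")
```

It could then be applied to a column with something like df['DESCRICAO'].apply(highlight_lines, args=('GRATIFICAÇÃO',)) before calling to_html(escape=False). Unlike the offset-based version, lines that do not match are kept in the output as they are.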
