Forum Discussion

peterlu
Champion
5 years ago

html entity decode

Hi,

 

We all know that FreeMarker can encode special HTML characters (&, quotes, <, >, and more),

e.g. & => &amp;

But I need a way to decode them,

e.g. &amp; => &

I have looked around. It looks like there is no such method built into FreeMarker?

 

Peter

 


10 Replies

  • peterlu,

    Why do you need to decode this? If you are sending text to an endpoint, you can apply "?url" directly to the variable so that it is passed URL-encoded. You can then URL-decode it on the other side, or pass it straight to any API or endpoint as a GET request parameter.

  • peterlu
    Champion
    5 years ago

    Parshant I am not decoding a URL parameter. After I get the REST API query result, the message body is in HTML format. I use the utils function to strip the HTML tags, and then I also need to decode the HTML entities so I can put the text into the JSON data that is sent to a third-party system. The third-party system needs to build an AI report around it. They want plain characters instead of HTML entities like &amp;, &apos;, etc. I know I could do this with string replacement, but that is dirty. The Java objects should have a way to do this, and the utils FreeMarker object should have it built in.

    The question is purely: is there a way to decode HTML entities with a FreeMarker object? Thanks.

  • peterlu
    Champion
    5 years ago

    phoneboy the data source contains e.g. "&amp;"

    <#assign test = "&amp; is a special character" />

    ${test?no_esc} won't work. I need a utility function to convert escaped HTML characters back to their original form.

    FreeMarker has the ?html built-in to escape characters. I need something like a ?undo_html, or a util.html.decode() method that Lithium can supply out of the box.

  • Depending on the use case I generally just use the noautoesc directive.

    For example:

    <#assign test = "&amp; is a special character" />
    
    <h3>Original:</h3>
    ${test}
    
    <h3>New:</h3>
    <#noautoesc>
      ${test}
    </#noautoesc>

    Output:

    Original:
    &amp; is a special character
    New:
    & is a special character

     

    Hope this helps!

  • peterlu
    Champion
    5 years ago

    Let me simplify the use case 🙂 

    I need to find "some_decode_function". Maybe Khoros can add it to the utility object.

    <#assign old = "&amp;" />
    <#assign new = some_decode_function(old) />
    <#if new == "&" && new != "&amp;">
    This is a solution. Thanks.
    </#if>

     

  • peterlu that's odd because within Studio and any custom components/macros/functions/endpoints I've written it works fine. (See below) Maybe Khoros and the template tester are running different FreeMarker versions?

  • peterlu
    Champion
    5 years ago

    jeffshurtliff It is still not working. If you view the page source, you will see what I mean.

    I am getting:
    &amp;amp;
    and
    &amp;

    What I need is just:
    &

    AndrewF  I found your post in a similar topic. Maybe you can help shed some light on it? Thanks.

  • AndrewF
    Khoros Oracle
    5 years ago

    Hi Peter, you are correct that FreeMarker did not implement HTML processing like decoding. This was deliberate: FreeMarker's primary use case is to output HTML, so to prevent common security vulnerabilities they designed it so that you work with raw data, and escaping for HTML is the final step before outputting. We're stretching FreeMarker past its design goals, doing much more complex things like API calls and data transformation -- so we did add a utility:

    utils.html.unescaper.unescape(...).

    The usual caveats apply:

    • FreeMarker is not a programming language; it is primarily designed for displaying data and is not great at converting or building data. If the message body is being sent somewhere else, it may be a better idea to implement all the HTML post-processing in that environment. This would also make it easier to tweak the algorithm, and it might make user requests faster by offloading some work from the community.
    • Unescaping should only be done on one continuous block of HTML text, with no HTML elements present. That is, all elements must be removed.
    • Even if you strip all HTML, the unescape may result in HTML (because the original escaped text may look like HTML). Once unescaped, make sure it is never accidentally placed directly in an HTML context without proper escaping.
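
To illustrate the first and third caveats, here is a minimal Java sketch of doing the decoding on the receiving side instead of in FreeMarker. It is not the implementation behind utils.html.unescaper, and the names HtmlEntities, unescape and escape are hypothetical, chosen only for this example. It assumes the tags have already been stripped, and it knows only a handful of named entities plus numeric references; a real system would more likely use the full HTML entity table or an established library.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for the receiving system; not part of Khoros or FreeMarker. */
final class HtmlEntities {
    // Deliberately small table: only a few common named entities are known here.
    private static final Map<String, String> NAMED = Map.of(
            "amp", "&", "lt", "<", "gt", ">",
            "quot", "\"", "apos", "'", "nbsp", "\u00A0");

    // Matches &name;, decimal &#38; and hexadecimal &#x26; references.
    private static final Pattern ENTITY =
            Pattern.compile("&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z][a-zA-Z0-9]*);");

    /** Decodes each entity once, in a single pass; unknown entities are left untouched. */
    static String unescape(String text) {
        Matcher m = ENTITY.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String body = m.group(1);
            String replacement = body.startsWith("#")
                    ? decodeNumeric(body, m.group())
                    : NAMED.getOrDefault(body, m.group());
            // quoteReplacement keeps '$' and '\' in the decoded text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    private static String decodeNumeric(String body, String original) {
        try {
            boolean hex = body.charAt(1) == 'x' || body.charAt(1) == 'X';
            int codePoint = hex
                    ? Integer.parseInt(body.substring(2), 16)
                    : Integer.parseInt(body.substring(1));
            return Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : original;
        } catch (NumberFormatException e) {
            return original; // number too large to be a code point
        }
    }

    /** Re-escapes text before it is placed back into an HTML context. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("&amp; is a special character")); // & is a special character
        System.out.println(unescape("&amp;amp;"));                    // &amp; (one level removed per call)
        String decoded = unescape("&lt;b&gt;bold&lt;/b&gt;");         // now looks like markup: <b>bold</b>
        System.out.println(escape(decoded));                          // safe to print into HTML again
    }
}
```

Because the decode is a single pass, a double-escaped value such as &amp;amp; comes back as &amp; and not &, so it is worth checking how many times the source data was escaped. The last two lines of main show the third caveat: the decoded text can look like markup, so it goes through escape again before it is written into any HTML.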