Diacritic Character removal. Those strange non text characters...

    Has anyone ever come up with a universal way to remove all control and special characters from strings in Miva Script?

    Removing characters below char 32 and above char 127 does not work, because some (but not all) of the characters are encoded as two bytes.
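
To illustrate the byte-length point, here is a minimal sketch in JavaScript (matching the examples in this thread, not Miva Script). It assumes the text is UTF-8 and that the runtime provides `TextEncoder` (modern browsers, Node.js); `utf8Hex` is a hypothetical helper name.

```javascript
// Assumption: the text is UTF-8. TextEncoder always encodes to UTF-8.
const encoder = new TextEncoder();

// Hypothetical helper: list the UTF-8 bytes of a string as two-digit hex.
function utf8Hex(text) {
  return Array.from(encoder.encode(text), (b) => b.toString(16).padStart(2, "0"));
}

const asciiBytes = utf8Hex("e");            // ["65"]             one byte
const precomposedBytes = utf8Hex("\u00e9"); // ["c3", "a9"]       é as a single code point
const decomposedBytes = utf8Hex("e\u0301"); // ["65", "cc", "81"] e + combining acute accent
```

Every byte of a multi-byte sequence is above 127, so a byte-by-byte filter drops the whole accented letter when it is stored precomposed, but leaves the base letter behind when it is stored decomposed.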

    Every example I find uses regular expressions or functions built into the language.

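    Code:
    // Here are JavaScript examples:

    const str = "Crème Brulée";

    // Decompose accented letters (NFD), then strip the combining marks U+0300 to U+036F.
    str.normalize("NFD").replace(/[\u0300-\u036f]/g, ""); // "Creme Brulee"

    // Or the more modern version, using a Unicode property escape:
    str.normalize("NFD").replace(/\p{Diacritic}/gu, "");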
    Last edited by RayYates; 07-27-21, 06:21 AM.
    Ray Yates
    "If I have seen further, it is by standing on the shoulders of giants."
    --- Sir Isaac Newton

    #2
    Just out of curiosity, why do you want to remove them? Any business that wants to support a language other than English may need them. Are they causing any specific problems?
    Kent Multer
    Magic Metal Productions
    http://TheMagicM.com
    * Web developer/designer
    * E-commerce and Miva
    * Author, The Official Miva Web Scripting Book -- available on-line:
    http://www.amazon.com/exec/obidos/IS...icmetalproducA


      #3
      You could iterate over the string, use the "isprint" builtin, and remove non-printable characters that way. However, that will not replace the characters with their closest "ascii" character. If the text is UTF-8, you should be able to detect a multi-byte sequence and know how many bytes to remove, too.
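
The byte-walking idea above might look roughly like this. It is a sketch in JavaScript (the language of the examples earlier in the thread), not Miva Script, and it assumes UTF-8 input. The 32..126 range check stands in for "isprint", which is an assumption about what counts as printable; `utf8SequenceLength` and `stripNonPrintableAscii` are hypothetical names.

```javascript
// Hypothetical helper: length of the UTF-8 sequence that starts with leadByte.
function utf8SequenceLength(leadByte) {
  if (leadByte < 0x80) return 1;            // 0xxxxxxx: plain ASCII
  if ((leadByte & 0xe0) === 0xc0) return 2; // 110xxxxx
  if ((leadByte & 0xf0) === 0xe0) return 3; // 1110xxxx
  if ((leadByte & 0xf8) === 0xf0) return 4; // 11110xxx
  return 1; // stray continuation or invalid byte: skip it on its own
}

// Hypothetical helper: keep printable ASCII (32..126) and drop everything else,
// removing each multi-byte sequence as a unit.
function stripNonPrintableAscii(text) {
  const bytes = new TextEncoder().encode(text); // assumption: UTF-8
  let out = "";
  let i = 0;
  while (i < bytes.length) {
    const len = utf8SequenceLength(bytes[i]);
    if (len === 1 && bytes[i] >= 32 && bytes[i] <= 126) {
      out += String.fromCharCode(bytes[i]);
    }
    i += len;
  }
  return out;
}

const stripped = stripNonPrintableAscii("Cr\u00e8me\tBrul\u00e9e"); // "CrmeBrule"
```

As noted, this removes the accented letters and the tab instead of mapping them to a nearby ASCII letter. In JavaScript, decomposing first keeps the base letters: `stripNonPrintableAscii("Crème Brulée".normalize("NFD"))` gives "Creme Brulee". A language without a normalization builtin would need a lookup table for that step.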
      David Carver
      Miva, Inc. | Software Developer
