Getting Unicode Character Codes in JavaScript?
jargonCCNA asks: "I've searched high and low across the web, but I can't seem to be able to find any code snippets or even anything that'll help me out here. I'm trying to get a Unicode character code from a data stream in JavaScript and there doesn't seem to be anything out there to help me; JavaScript itself only has onboard support for ISO-Latin_1, or something. I tried hacking my own converter code, but it's rife with errors. Anybody know of some code that I can include in a GPL project?"
"Here's the buggy code, if you're interested:

function unicode2hex( unicode )
{
    var hexString = "";
    for( var i = 0x0000; i <= 0xFFFF; i++ )
    {
        test = eval( "\\u" + i );
        if ( unicode == test )
        {
            hexString += i / 4096;
            hexString += i / 256;
            hexString += i / 16;
            hexString += i % 16;
            hexString += "";
            return hexString;
        }
    }
    return false;
}

"Mozilla's JavaScript console lets me know that '\u0' is an illegal character. I think this would work if I could make it use the string "0000" instead of the number 0 for i.
Just for reference -- I've seen a lot of people get nailed on Ask /. because they didn't do the proper research before asking their question. Google has failed me; I've been trying to figure this out on my own for about a month. I hope someone can shed some light on my situation."
Story submissions (Score:1)
How did this story get past the lameness filter?
Stories are probably not subject to the lameness filter (or at least they have looser filters) because an editor must approve each story by hand.
That said, I have a possible (untested) solution: Try changing each += in the inner loop to a +=""+ to force the strings to be concatenated rather than treated as numbers.
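For what it's worth, `+=` onto a string already concatenates, so the likelier problem in the asker's inner loop is that `i / 4096` and friends yield fractional decimal numbers rather than single hex digits. A minimal sketch of the digit extraction that loop seems to be aiming for, assuming a 16-bit code as input (`HEX_DIGITS` and `toHexDigits` are hypothetical names, not from the original post):

```javascript
// Hypothetical lookup table of hex digits.
var HEX_DIGITS = "0123456789ABCDEF";

// Hypothetical helper: turn a 16-bit code into exactly four hex digits.
function toHexDigits(code) {
    var out = "";
    // Walk the four 4-bit groups from most to least significant.
    for (var shift = 12; shift >= 0; shift -= 4) {
        var nibble = (code >> shift) & 0xF; // always an integer from 0 to 15
        out += HEX_DIGITS.charAt(nibble);
    }
    return out;
}
```

With this, `toHexDigits(0x00A9)` gives `"00A9"`. Shifting and masking keeps every intermediate value an integer, which plain division does not.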
Ask the Experts (Score:1)
Isn't this a question for developer.net? (Score:2)
Try asking your question in IRC before hitting up "Ask Slashdot."
A Google search for unicode and javascript brings back a lot of positive-looking results, though I haven't actually delved into them. It seems like JS 1.5 has support for this (going by the Google summaries).
Re:Isn't this a question for developer.net? (Score:1)
Yeah, positive looking. That's the thing. Looks are exceedingly deceiving on a search engine. Try actually delving in; I can almost guarantee that it won't convert Unicode characters to their character codes.
Now for the answer.. (Score:1)
document.write("\u00A9 Netscape Communications" );
I just did that in Galeon and it works fine...
See - http://developer.netscape.com/docs/manuals/js/cor
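A `\uXXXX` escape like the one above is resolved when the source is parsed, which is probably why gluing one together at run time with eval() (as in the question) goes wrong. A minimal sketch of the run-time equivalent, using the standard `String.fromCharCode`; `charFromCode` is a hypothetical wrapper name used only for illustration:

```javascript
// The escape is turned into a character by the parser...
var fromEscape = "\u00A9";

// ...while String.fromCharCode builds the same one-character
// string from a number at run time, with no eval() involved.
var fromNumber = String.fromCharCode(0x00A9);

// Hypothetical wrapper: the character for a given 16-bit code.
function charFromCode(code) {
    return String.fromCharCode(code);
}
```

Here `fromEscape === fromNumber` holds, so a loop that needs "the character for code i" can call `charFromCode(i)` directly.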
Re:Now for the answer.. (Score:1)
document.write("\u00A9".charCodeAt(0));
That provides the decimal, then you just have to convert to hex.
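var hexChars = "0123456789ABCDEF"; // digit lookup table; Dec2Hex only handles values from 0 to 255
function Dec2Hex (Dec) { var a = Dec % 16; var b = (Dec - a) / 16; var hex = "" + hexChars.charAt(b) + hexChars.charAt(a); return hex; }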
Blatantly ripped off from here [internet.com]
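Dec2Hex as written yields two hex digits, so it only covers 0 to 255, and charCodeAt can return anything up to 65535. A minimal sketch of a wider version, swapping in the built-in `Number.prototype.toString(16)` for the lookup table (`dec2Hex4` is a hypothetical name, and it assumes a non-negative integer no larger than 0xFFFF):

```javascript
// Hypothetical helper: a code from 0 to 65535 as four uppercase hex digits.
function dec2Hex4(dec) {
    // toString(16) gives lowercase hex with no leading zeros.
    var hex = dec.toString(16).toUpperCase();
    // Left-pad with zeros up to four digits.
    while (hex.length < 4) {
        hex = "0" + hex;
    }
    return hex;
}
```

For example, `dec2Hex4("\u00A9".charCodeAt(0))` gives `"00A9"`.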
Re:Wrong topic Cliff, you cockfoster (Score:1)
Lick your own.
Straight to the source! (Score:1)
Why don't you ask the Mozilla developers that are working on JavaScript 2.0?
Did you try looking at the docs? (Score:5, Informative)
document.write("\u00A9 is ");
document.write("\u00A9".charCodeAt(0));
That will give you the answer in decimal. I trust you can convert to hex yourself.
(Note: Requires JavaScript 1.3; previous versions used ISO-Latin-1 rather than Unicode, and I don't know what they'd do with a character higher than 255.)
Re:Did you try looking at the docs? (Score:1)
I looked through all the documentation I could find; the only thing I found about charCodeAt() was that it uses ISO-Latin. But I think those docs also said they were JavaScript 1.2-specific.
Any idea what version of JavaScript IE6 emulates, and Mozilla actually uses?
Re:Did you try looking at the docs? (Score:1)
(Similar code with characters outside the range of Latin-1 also works on both, though the browsers sometimes display the "no glyph for that" glyph: an open box for IE, "?" for NS/Moz.)
Couldn't tell you what JS versions each browser actually uses, though.
Re:Did you try looking at the docs? (Score:1)
But, I'm assuming that IE will just use whatever version of JScript you happen to have installed on your machine. And, as far as I know, JScript really does follow the ECMAScript specification, which is a real spec, with standards bodies and the whole works, unlike "JavaScript", whatever that is, exactly.
Anyhow, take a look here [microsoft.com] to get a look at some of the features of the JScript interpreter hosted in some of your favorite applications.
Re:Did you try looking at the docs? (Score:1)
IE6 doesn't emulate JavaScript. It uses JScript, which is Microsoft's implementation of the ECMA-262 Edition 3 language standard (ECMAScript). Similarly, JavaScript is Netscape's implementation of the same standard. Neither is "emulating" anything.
You can find the ECMAScript standard here: ECMA-262v3 [www.ecma.ch]. You can discover what your favorite vendor has actually implemented by visiting the Mozilla [mozilla.com] or Microsoft [microsoft.com] documentation for each vendor's implementation.
Re:Did you try looking at the docs? (Score:2)
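// Returns the character codes of a 1- to 4-character string,
// concatenated as text, last character first.
function tounicode(instr) {
    var len = instr.length;
    switch (len) {
    case 1:
        return instr.charCodeAt(0);
    case 2:
        return "" + instr.charCodeAt(1) + instr.charCodeAt(0);
    case 3:
        // The leading "" forces string concatenation instead of numeric addition.
        return "" + instr.charCodeAt(2) + instr.charCodeAt(1) + instr.charCodeAt(0);
    case 4:
        return "" + instr.charCodeAt(3) + instr.charCodeAt(2) + instr.charCodeAt(1) + instr.charCodeAt(0);
    }
    // Empty strings and strings longer than four characters are not handled.
    return "";
}
document.write(tounicode("\u002d") + " " + tounicode("-") + "<br>");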
With this you can take a short string like "fooo" and get its Unicode equivalent as character codes.
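The switch above stops at four characters. A minimal sketch of the same idea for a string of any length, keeping the codes separate and in reading order (`charCodes` is a hypothetical name; note that charCodeAt reports 16-bit code units, so a character outside the 16-bit range shows up as two entries):

```javascript
// Hypothetical helper: the charCodeAt value of every position in a string.
function charCodes(str) {
    var codes = [];
    for (var i = 0; i < str.length; i++) {
        codes.push(str.charCodeAt(i));
    }
    return codes;
}
```

For example, `charCodes("fooo")` returns `[102, 111, 111, 111]`. Returning an array rather than one glued-together string keeps the boundary between codes unambiguous.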