Internal Storage Encoding of Characters

ASCII

The American Standard Code for Information Interchange (ASCII) is a character-encoding scheme, originally based on the English alphabet, that encodes 128 specified characters as 7-bit binary integers: the digits 0-9, the letters a-z and A-Z, some basic punctuation symbols, some control codes that originated with Teletype machines, and a blank space. ASCII codes represent text in computers, communications equipment, and other devices that use text. Most modern character-encoding schemes are based on ASCII, though they support many additional characters.
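As a minimal sketch of what "7-bit" means in practice, the Python snippet below encodes a short string as ASCII, checks that every code fits in 7 bits, and confirms that UTF-8 (one modern scheme built on ASCII) yields the same bytes for ASCII-only text. The helper name is_seven_bit is purely illustrative and not part of any standard.

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte value fits in 7 bits (0-127)."""
    return all(b < 128 for b in data)


text = "Hi!"
ascii_bytes = text.encode("ascii")   # one byte per character
codes = list(ascii_bytes)            # integer code of each character

print(codes)                         # decimal ASCII codes of H, i, !
print(is_seven_bit(ascii_bytes))     # every code is below 128
# For ASCII-only text, UTF-8 produces exactly the same bytes.
print(text.encode("utf-8") == ascii_bytes)
```

A character outside the 128-character repertoire, such as "é", cannot be encoded with the "ascii" codec at all; in UTF-8 it becomes a sequence of bytes with the high bit set.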

ASCII printable code chart

Binary    Oct  Dec  Hex  Glyph
010 0000  040  32   20   (space)
010 0001  041  33   21   !
010 0010  042  34   22   "
010 0011  043  35   23   #
010 0100  044  36   24   $
010 0101  045  37   25   %
010 0110  046  38   26   &
010 0111  047  39   27   '
010 1000  050  40   28   (
010 1001  051  41   29   )
010 1010  052  42   2A   *
010 1011  053  43   2B   +
010 1100  054  44   2C   ,
010 1101  055  45   2D   -
010 1110  056  46   2E   .
010 1111  057  47   2F   /
011 0000  060  48   30   0
011 0001  061  49   31   1
011 0010  062  50   32   2
011 0011  063  51   33   3
011 0100  064  52   34   4
011 0101  065  53   35   5
011 0110  066  54   36   6
011 0111  067  55   37   7
011 1000  070  56   38   8
011 1001  071  57   39   9
011 1010  072  58   3A   :
011 1011  073  59   3B   ;
011 1100  074  60   3C   <
011 1101  075  61   3D   =
011 1110  076  62   3E   >
011 1111  077  63   3F   ?

Binary    Oct  Dec  Hex  Glyph
100 0000  100  64   40   @
100 0001  101  65   41   A
100 0010  102  66   42   B
100 0011  103  67   43   C
100 0100  104  68   44   D
100 0101  105  69   45   E
100 0110  106  70   46   F
100 0111  107  71   47   G
100 1000  110  72   48   H
100 1001  111  73   49   I
100 1010  112  74   4A   J
100 1011  113  75   4B   K
100 1100  114  76   4C   L
100 1101  115  77   4D   M
100 1110  116  78   4E   N
100 1111  117  79   4F   O
101 0000  120  80   50   P
101 0001  121  81   51   Q
101 0010  122  82   52   R
101 0011  123  83   53   S
101 0100  124  84   54   T
101 0101  125  85   55   U
101 0110  126  86   56   V
101 0111  127  87   57   W
101 1000  130  88   58   X
101 1001  131  89   59   Y
101 1010  132  90   5A   Z
101 1011  133  91   5B   [
101 1100  134  92   5C   \
101 1101  135  93   5D   ]
101 1110  136  94   5E   ^
101 1111  137  95   5F   _

Binary    Oct  Dec  Hex  Glyph
110 0000  140  96   60   `
110 0001  141  97   61   a
110 0010  142  98   62   b
110 0011  143  99   63   c
110 0100  144  100  64   d
110 0101  145  101  65   e
110 0110  146  102  66   f
110 0111  147  103  67   g
110 1000  150  104  68   h
110 1001  151  105  69   i
110 1010  152  106  6A   j
110 1011  153  107  6B   k
110 1100  154  108  6C   l
110 1101  155  109  6D   m
110 1110  156  110  6E   n
110 1111  157  111  6F   o
111 0000  160  112  70   p
111 0001  161  113  71   q
111 0010  162  114  72   r
111 0011  163  115  73   s
111 0100  164  116  74   t
111 0101  165  117  75   u
111 0110  166  118  76   v
111 0111  167  119  77   w
111 1000  170  120  78   x
111 1001  171  121  79   y
111 1010  172  122  7A   z
111 1011  173  123  7B   {
111 1100  174  124  7C   |
111 1101  175  125  7D   }
111 1110  176  126  7E   ~
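
The columns of the chart are simply the same 7-bit code written in different bases. As an illustration, the following Python sketch derives one chart row from a character; the function name chart_row and the "(space)" label are choices made for this example only.

```python
def chart_row(ch: str) -> tuple:
    """Return (binary, octal, decimal, hex, glyph) for one ASCII character."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    bits = format(code, "07b")            # 7-bit binary, zero-padded
    binary = bits[:3] + " " + bits[3:]    # grouped 3 + 4, as in the chart
    glyph = "(space)" if ch == " " else ch
    return (binary, format(code, "03o"), str(code), format(code, "X"), glyph)


for ch in " A~":
    print(*chart_row(ch))
```

For "A" this gives 100 0001, 101, 65, 41 and A, matching the corresponding row of the chart.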

ISCII

Indian Standard Code for Information Interchange (ISCII) is a coding scheme for representing various writing systems of India. It encodes the main Indic scripts and a Roman transliteration. The supported scripts are: Assamese, Bengali (Bangla), Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, and Telugu. ISCII does not encode the writing systems of India based on Arabic, but its writing system switching codes nonetheless provide for Kashmiri, Sindhi, Urdu, Persian, Pashto and Arabic.

The following table shows the character set for Devanagari. The code sets for Assamese, Bengali, Gujarati, Gurmukhi, Kannada, Malayalam, Oriya, Tamil, and Telugu are similar, with each Devanagari form replaced by the equivalent form in each writing system. Each character is shown with its decimal code and its Unicode equivalent.

ISCII Devanagari

Hex    Dec  Unicode  Character
00     0    0000     NUL
01     1    0001     SOH
02     2    0002     STX
03     3    0003     ETX
04     4    0004     EOT
05     5    0005     ENQ
06     6    0006     ACK
07     7    0007     BEL
08     8    0008     BS
09     9    0009     HT
0A     10   000A     LF
0B     11   000B     VT
0C     12   000C     FF
0D     13   000D     CR
0E     14   000E     SO
0F     15   000F     SI
10     16   0010     DLE
11     17   0011     DC1
12     18   0012     DC2
13     19   0013     DC3
14     20   0014     DC4
15     21   0015     NAK
16     22   0016     SYN
17     23   0017     ETB
18     24   0018     CAN
19     25   0019     EM
1A     26   001A     SUB
1B     27   001B     ESC
1C     28   001C     FS
1D     29   001D     GS
1E     30   001E     RS
1F     31   001F     US
20     32   0020     SP
21     33   0021     !
22     34   0022     "
23     35   0023     #
24     36   0024     $
25     37   0025     %
26     38   0026     &
27     39   0027     '
28     40   0028     (
29     41   0029     )
2A     42   002A     *
2B     43   002B     +
2C     44   002C     ,
2D     45   002D     -
2E     46   002E     .
2F     47   002F     /
30     48   0030     0
31     49   0031     1
32     50   0032     2
33     51   0033     3
34     52   0034     4
35     53   0035     5
36     54   0036     6
37     55   0037     7
38     56   0038     8
39     57   0039     9
3A     58   003A     :
3B     59   003B     ;
3C     60   003C     <
3D     61   003D     =
3E     62   003E     >
3F     63   003F     ?
40     64   0040     @
41     65   0041     A
42     66   0042     B
43     67   0043     C
44     68   0044     D
45     69   0045     E
46     70   0046     F
47     71   0047     G
48     72   0048     H
49     73   0049     I
4A     74   004A     J
4B     75   004B     K
4C     76   004C     L
4D     77   004D     M
4E     78   004E     N
4F     79   004F     O
50     80   0050     P
51     81   0051     Q
52     82   0052     R
53     83   0053     S
54     84   0054     T
55     85   0055     U
56     86   0056     V
57     87   0057     W
58     88   0058     X
59     89   0059     Y
5A     90   005A     Z
5B     91   005B     [
5C     92   005C     \
5D     93   005D     ]
5E     94   005E     ^
5F     95   005F     _
60     96   0060     `
61     97   0061     a
62     98   0062     b
63     99   0063     c
64     100  0064     d
65     101  0065     e
66     102  0066     f
67     103  0067     g
68     104  0068     h
69     105  0069     i
6A     106  006A     j
6B     107  006B     k
6C     108  006C     l
6D     109  006D     m
6E     110  006E     n
6F     111  006F     o
70     112  0070     p
71     113  0071     q
72     114  0072     r
73     115  0073     s
74     116  0074     t
75     117  0075     u
76     118  0076     v
77     119  0077     w
78     120  0078     x
79     121  0079     y
7A     122  007A     z
7B     123  007B     {
7C     124  007C     |
7D     125  007D     }
7E     126  007E     ~
7F     127  007F     DEL
80-9F                (empty)
A0     160           (empty)
A1     161  0901     ँ
A2     162  0902     ं
A3     163  0903     ः
A4     164  0905     अ
A5     165  0906     आ
A6     166  0907     इ
A7     167  0908     ई
A8     168  0909     उ
A9     169  090A     ऊ
AA     170  090B     ऋ
AB     171  090E     ऎ
AC     172  090F     ए
AD     173  0910     ऐ
AE     174  090D     ऍ
AF     175  0912     ऒ
B0     176  0913     ओ
B1     177  0914     औ
B2     178  0911     ऑ
B3     179  0915     क
B4     180  0916     ख
B5     181  0917     ग
B6     182  0918     घ
B7     183  0919     ङ
B8     184  091A     च
B9     185  091B     छ
BA     186  091C     ज
BB     187  091D     झ
BC     188  091E     ञ
BD     189  091F     ट
BE     190  0920     ठ
BF     191  0921     ड
C0     192  0922     ढ
C1     193  0923     ण
C2     194  0924     त
C3     195  0925     थ
C4     196  0926     द
C5     197  0927     ध
C6     198  0928     न
C7     199  0929     ऩ
C8     200  092A     प
C9     201  092B     फ
CA     202  092C     ब
CB     203  092D     भ
CC     204  092E     म
CD     205  092F     य
CE     206  095F     य़
CF     207  0930     र
D0     208  0931     ऱ
D1     209  0932     ल
D2     210  0933     ळ
D3     211  0934     ऴ
D4     212  0935     व
D5     213  0936     श
D6     214  0937     ष
D7     215  0938     स
D8     216  0939     ह
D9     217  (none)   INV
DA     218  093E     ा
DB     219  093F     ि
DC     220  0940     ी
DD     221  0941     ु
DE     222  0942     ू
DF     223  0943     ृ
E0     224  0946     ॆ
E1     225  0947     े
E2     226  0948     ै
E3     227  0945     ॅ
E4     228  094A     ॊ
E5     229  094B     ो
E6     230  094C     ौ
E7     231  0949     ॉ
E8     232  094D     ्
E9     233  093C     ़
EA     234  0964     ।
EB-EE                (empty)
EF     239  (none)   ATR
F0     240  (none)   EXT
F1     241  0966     ०
F2     242  0967     १
F3     243  0968     २
F4     244  0969     ३
F5     245  096A     ४
F6     246  096B     ५
F7     247  096C     ६
F8     248  096D     ७
F9     249  096E     ८
FA     250  096F     ९
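
To illustrate how the chart pairs each ISCII code with a Unicode code point, here is a deliberately simplified Python sketch. It copies only six entries from the Devanagari chart and treats codes below 128 as plain ASCII, as the lower half of the chart shows. The names ISCII_DEVANAGARI and iscii_to_unicode are hypothetical, and a real converter would need the full table plus handling for INV, ATR, EXT and script switching, none of which is attempted here.

```python
# Partial ISCII (Devanagari) -> Unicode table, copied from the chart.
# Only a handful of codes are listed; this is not a complete converter.
ISCII_DEVANAGARI = {
    194: 0x0924,  # letter TA
    198: 0x0928,  # letter NA
    204: 0x092E,  # letter MA
    215: 0x0938,  # letter SA
    225: 0x0947,  # vowel sign E
    232: 0x094D,  # virama (halant)
}


def iscii_to_unicode(data: bytes) -> str:
    """Map each ISCII byte to a Unicode character using the partial table."""
    out = []
    for b in data:
        if b < 128:
            out.append(chr(b))                     # lower half is ASCII
        elif b in ISCII_DEVANAGARI:
            out.append(chr(ISCII_DEVANAGARI[b]))
        else:
            out.append("\ufffd")                   # unmapped in this sketch
    return "".join(out)


sample = bytes([198, 204, 215, 232, 194, 225])
print(iscii_to_unicode(sample))
```

The sample bytes correspond to the code points 0928, 092E, 0938, 094D, 0924 and 0947, in that order.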

 

Unicode

Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world’s writing systems. Developed in conjunction with the Universal Character Set standard and published in book form as The Unicode Standard, the latest version of Unicode contains a repertoire of more than 110,000 characters covering 100 scripts and various symbols. The standard consists of a set of code charts for visual reference, an encoding method and a set of standard character encodings, a set of reference data computer files, and a number of related items, such as character properties and rules for normalization, decomposition, collation, rendering, and bidirectional display order (for the correct display of text containing both right-to-left scripts, such as Arabic and Hebrew, and left-to-right scripts). As of June 2014, the most recent version is Unicode 7.0. The standard is maintained by the Unicode Consortium.
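
Several of the items listed above (character properties, bidirectional classes, normalization and decomposition) can be inspected with Python's standard unicodedata module. The sketch below is only an illustration: the three sample characters are arbitrary choices, and the module reflects whichever Unicode version ships with the Python build in use, not necessarily 7.0.

```python
import unicodedata

# A Latin, a Devanagari and a Hebrew letter, chosen only as examples.
for ch in "A\u0915\u05D0":
    print(
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),            # character name
        unicodedata.category(ch),        # general category, e.g. Lu or Lo
        unicodedata.bidirectional(ch),   # bidirectional class, L or R here
        ch.encode("utf-8").hex(),        # bytes in the UTF-8 encoding
    )

# Normalization: a precomposed letter and its decomposed form.
composed = "\u00E9"                                  # e with acute accent
decomposed = unicodedata.normalize("NFD", composed)  # base letter + accent
print([f"U+{ord(c):04X}" for c in decomposed])
print(unicodedata.normalize("NFC", decomposed) == composed)
```

The Hebrew letter reports the right-to-left class R, while the Latin and Devanagari letters report L, which is the information a bidirectional display algorithm works from.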