Converting Fixed-Width Text Files To CSV in C++ (and for free)

Large Padded Data

A recent data acquisition required us to process the fixed-width text files that comprise the data. This would not have been much of a discussion point were it not for the fact that some of the files were huge (60 GB in one case). Most of the content of these large files is the space character, serving as padding for the fixed-width fields; this illustrates how inefficient fixed-width text files are, but that is not the point we’re making here today.

Converting Using Existing Applications

We decided to convert each file into a CSV file, which can easily be read, edited and loaded into a database. There are applications that are able to convert massive text files to CSV, but during our brief trial of a few programs we found:-

  • One application was unreliable (crashed)
  • One application was buggy (unable to handle commas and/or quotes)
  • One application was expensive (several hundred dollars)

… and anyway, programming your own is much more fun. We created a C++ application in order to process these large files as quickly as possible.

Our C++ Application

We created a C++ command line application called texttocsv. We compiled it as a 32-bit Windows executable, but the code is ANSI C++, uses no Windows API code and no other libraries, and will compile for other operating systems.

texttocsv can read an 8-bit character fixed-width text file (there is no support for Unicode) and quickly convert each row to CSV. It encloses a field in quotes only where needed, and correctly escapes quotes within fields.

A Short Example

Consider the following fixed-width text file (test.txt) with two fixed columns of 20 and 50 characters:-

Jupiter             A planet                 
Andromeda           A "nearby" galaxy
Sirius              A star
Eros                An asteroid, a rocky body
Titan               A moon

…we used texttocsv.exe to create the following CSV (test.csv):-

Jupiter,A planet
Andromeda,"A ""nearby"" galaxy"
Sirius,A star
Eros,"An asteroid, a rocky body"
Titan,A moon

Note that the “Andromeda” row has the quotes around “nearby” correctly escaped.
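
The quoting rule shown in the example can be sketched as a small stand-alone function. This is only an illustration of the rule, not the code used by texttocsv: the name to_csv_field is hypothetical, and it uses std::string where the application uses its own buffer class.

```cpp
#include <string>

// Hypothetical helper (not part of texttocsv): trims the space padding
// from one fixed-width field and returns it as a CSV field.
std::string to_csv_field(const std::string& raw)
{
    // Trim the padding spaces from both ends.
    const std::string::size_type first = raw.find_first_not_of(' ');
    if (first == std::string::npos)
    {
        return "";  // the field was all padding
    }
    const std::string::size_type last = raw.find_last_not_of(' ');
    const std::string text = raw.substr(first, last - first + 1);

    // A field without commas or quotes is written unchanged.
    if (text.find_first_of(",\"") == std::string::npos)
    {
        return text;
    }

    // Otherwise enclose it in quotes and double any embedded quote.
    std::string out = "\"";
    for (std::string::size_type i = 0; i < text.size(); ++i)
    {
        if (text[i] == '"')
        {
            out += '"';
        }
        out += text[i];
    }
    out += '"';
    return out;
}
```

Passing the padded “Jupiter” description returns A planet unchanged, while the “Andromeda” and “Eros” descriptions come back enclosed in quotes.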

How To Use

The parameters for texttocsv are:-

  1. Destination file
  2. Source file
  3. List of column widths in the source file, separated by a comma
  4. Start column number, 0 based. This parameter is optional.

The example above was created with the following command:-

texttocsv.exe test.csv test.txt 20,50
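
To make the meaning of the widths parameter concrete, here is a minimal sketch of how a string such as 20,50 maps onto a row. The helper names parse_widths and split_row are assumptions made for this illustration; the application itself uses plain character buffers rather than std::string and std::vector, and the optional start column is left out.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: turns "20,50" into the widths 20 and 50.
std::vector<unsigned int> parse_widths(const std::string& spec)
{
    std::vector<unsigned int> widths;
    std::string::size_type pos = 0;
    while (pos <= spec.size())
    {
        std::string::size_type comma = spec.find(',', pos);
        if (comma == std::string::npos)
        {
            comma = spec.size();
        }
        // A missing or non-numeric value is stored as width 0.
        widths.push_back(static_cast<unsigned int>(
            std::strtoul(spec.substr(pos, comma - pos).c_str(), NULL, 10)));
        pos = comma + 1;
    }
    return widths;
}

// Hypothetical helper: cuts one row into its raw (still padded) fields.
std::vector<std::string> split_row(const std::string& row,
                                   const std::vector<unsigned int>& widths)
{
    std::vector<std::string> fields;
    std::string::size_type pos = 0;
    for (std::vector<unsigned int>::size_type i = 0; i < widths.size(); ++i)
    {
        // A short row yields short or empty fields rather than an error.
        fields.push_back(pos < row.size() ? row.substr(pos, widths[i])
                                          : std::string());
        pos += widths[i];
    }
    return fields;
}
```

With the widths 20,50, the “Sirius” row splits into a 20 character name field (still padded) and the description A star.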

Our C++ Source Code

Introduction

The application was created in ANSI C++. We followed the RAII programming technique, creating classes whose destructors release their resources, and which cannot be created using new.
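
As a minimal sketch of the RAII idea in isolation, the class below owns a C file handle and closes it in its destructor. ScopedFile and its member names are hypothetical and do not appear in the application; the private operator new declarations mirror the technique described above for keeping instances on the stack.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical RAII wrapper around a C FILE*, for illustration only.
class ScopedFile
{
private:
    std::FILE* m_File;

    // Declared but never defined: blocks "new ScopedFile" and copying.
    void* operator new(std::size_t);
    void* operator new[](std::size_t);
    ScopedFile(const ScopedFile&);
    ScopedFile& operator=(const ScopedFile&);

public:
    ScopedFile(const char* path, const char* mode)
        : m_File(std::fopen(path, mode))
    {
    }

    // The destructor releases the file on every exit path.
    ~ScopedFile()
    {
        if (m_File != NULL)
        {
            std::fclose(m_File);
        }
    }

    bool IsOpen() const { return m_File != NULL; }
    std::FILE* Get() const { return m_File; }
};
```

A function can declare a ScopedFile as a local variable, return early on any error, and rely on the destructor to close the file.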

Execute Sequence

The program is entered at main, and receives the parameters as entered by the user. The parameters are read, and any missing parameters are reported to the user. If all mandatory parameters are present, an instance of MyProcess (which is derived from CTextToCSV) is created on the stack. The member function Process is executed, and the returned error code is checked and reported.

CTextToCSV Class

The processing is done using an instance of a class derived from CTextToCSV. The derived class should override the method Progress, which is invoked for each row processed; our application simply reports to the console every 10000 rows, but if the class were used in a GUI, a progress bar could be displayed. The class must be created on the stack; new is not permitted.
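
The progress hook can be sketched on its own. The base class below is a reduced stand-in for CTextToCSV that does nothing except call the hook, and the names RowProcessorSketch, IntervalProgress, ProcessRows and Reports are assumptions made for this illustration.

```cpp
#include <cstdio>

// Reduced stand-in for CTextToCSV: it only drives the progress hook.
class RowProcessorSketch
{
protected:
    // Called once per row; the default does nothing.
    virtual void Progress(unsigned int row_num) { (void)row_num; }

public:
    virtual ~RowProcessorSketch() {}

    // Pretends to process num_rows rows, reporting each one.
    void ProcessRows(unsigned int num_rows)
    {
        for (unsigned int row = 1; row <= num_rows; ++row)
        {
            Progress(row);
        }
    }
};

// Derived class that reports to the console every m_Interval rows.
class IntervalProgress : public RowProcessorSketch
{
private:
    unsigned int m_Interval;
    unsigned int m_Reports;

protected:
    void Progress(unsigned int row_num)
    {
        if ((row_num % m_Interval) == 0)
        {
            ++m_Reports;
            std::printf("Row %u\n", row_num);
        }
    }

public:
    explicit IntervalProgress(unsigned int interval)
        : m_Interval(interval == 0 ? 1 : interval), m_Reports(0)
    {
    }

    // Number of reports printed so far.
    unsigned int Reports() const { return m_Reports; }
};
```

A GUI version would replace the printf call with an update to a progress bar; the processing loop itself would not need to change.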

CTextToCSV includes some error codes in an enum named ErrCode. The project code should normally interpret the errors and display to the user, as our simple application does.
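
One way to keep the interpretation of the error codes in a single place is a small lookup function. This is a sketch rather than the application's code: ErrCodeSketch is a stand-in copy of the enum values, and DescribeError is a hypothetical helper.

```cpp
// Stand-in copy of the ErrCode values declared in CTextToCSV.
enum ErrCodeSketch
{
    None = 0,
    OutOfMem,
    FileNotFound,
    FileOpenForWriteFailed
};

// Hypothetical helper: returns a short message for an error code.
const char* DescribeError(ErrCodeSketch err)
{
    switch (err)
    {
        case None:                   return "no errors";
        case OutOfMem:               return "out of memory";
        case FileNotFound:           return "source file not found";
        case FileOpenForWriteFailed: return "unable to open destination file";
    }
    return "unknown error";
}
```

A console program can print the returned text directly, while a GUI could show the same text in a message box.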

CTextToCSV will close any open files in its destructor, and the instances of the other classes used (CCharBuffer and CWidthList) free their allocated buffers in their own destructors. Each class instance is created on the stack, guaranteeing that any resources are released.

The classes include buffer overwrite protection. Memory consumption is modest and is proportional to total width of the fixed text file, and memory allocation errors (if the system is short on memory) are handled, returning the ErrCode value OutOfMem.

All memory allocated is freed by the class destructors.

The Code

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <new>

//class to hold the array of column widths
class CWidthList
{
private:
   unsigned int* m_Column_Width;
   unsigned int m_Num_Cols;
   unsigned int m_Total_Width;

   //prevent new from being used - force
   //any instance to be on the stack
   void * operator new (size_t);
   void * operator new[] (size_t);

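public:
   //constructor
   CWidthList()
   {
      this->m_Column_Width=NULL;
      this->m_Num_Cols=0;
      this->m_Total_Width=0;
   }

   //destructor
   ~CWidthList()
   {
      //free the allocated widths array
      delete[] this->m_Column_Width;
   }
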
   //return width of column number (zero based)
   //returns 0 if column number was invalid
   unsigned int GetWidth(unsigned int column_number)
   {
      if(column_number<this->m_Num_Cols)
      {
         return(this->m_Column_Width[column_number]);
      }

      return 0;
   }

   //return total width of columns
   unsigned int GetTotalWidth(void)
   {
      return(this->m_Total_Width);
   }

   //return the number of columns
   unsigned int GetNumCols(void)
   {
      return(this->m_Num_Cols);
   }

   //parse the widths string
   //returns false if out of memory
   bool ParseWidths(const char* p_widths)
   {
      unsigned int total = 0;

      unsigned int i=0;
      unsigned int len=strlen(p_widths);

      //count number of commas in the width string
      this->m_Num_Cols=1;
      while(i<len)
      {
         if(p_widths[i]==',')
         {
            this->m_Num_Cols++;
         }
         i++;
      }

      //create the buffer for the column widths

      try
      {
         this->m_Column_Width=new unsigned int[this->m_Num_Cols];
      }
      catch (std::bad_alloc)
      {
         //out of memory for the buffer
         return false;
      }

      //set all widths to 0
      i=0;
      while(i<this->m_Num_Cols)
      {
         m_Column_Width[i]=0;
         i++;
      }

      char val[8];
      int valpos=0;
      int w;
      i=0;
      unsigned int cur_col=0;
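      while(i<len)
      {
         char c=p_widths[i];
         if(c==',')
         {
            //end of a width value; convert and store it
            val[valpos]=0;
            w=atoi(val);
            this->m_Column_Width[cur_col]=w;
            total+=w;
            cur_col++;
            valpos=0;
         }
         else if(c>='0' && c<='9' && valpos<7)
         {
            //collect the digit (val holds up to 7 digits)
            val[valpos]=c;
            valpos++;
         }
         i++;
      }

      //convert and store the final width
      val[valpos]=0;
      w=atoi(val);
      this->m_Column_Width[cur_col] = w;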
      total += w;
      this->m_Total_Width=total;

      return true;
   }

};

//simple 8 bit character buffer class
class CCharBuffer
{
private:

   char* m_Data;
   unsigned int m_Max_Chars;      //maximum number of
                                  //chars allowed
   unsigned int m_Pos;            //current write pos

   //prevent new from being used -
   //force any instance to be on the stack
   void * operator new (size_t);
   void * operator new[] (size_t);

public:

   //constructor
   CCharBuffer()
   {
      try
      {
         //allocate num chars plus an extra
         this->m_Max_Chars = 32;
         this->m_Data=new char[this->m_Max_Chars+1];
      }
      catch (std::bad_alloc)
      {
         //out of memory when creating the buffer
         //so mark the buffer as not created
         this->m_Data=NULL;
         this->m_Max_Chars = 0;
      }

      this->m_Pos=0;
   }

   //destructor
   ~CCharBuffer()
   {
      //free the allocated buffer
      if(this->m_Data!=NULL)
      {
         delete[] this->m_Data;
         this->m_Data=NULL;
      }
   }

   //return false if failed to allocate a buffer
   bool CheckSpace(const unsigned int num_chars)
   {
      if(num_chars<=this->m_Max_Chars)
      {
         return true;
      }

      char* new_dest = NULL;

      try
      {
         //allocate num chars plus an extra
         new_dest = new char[num_chars+1];
      }
      catch (std::bad_alloc)
      {
         //out of memory when resizing destination buffer
         return false;
      }

      if(this->m_Pos>0 && this->m_Data!=NULL)
      {
         //copy the existing data into the new buffer
         memcpy(new_dest,this->m_Data,this->m_Pos);
      }

      if(this->m_Data!=NULL)
      {
         delete[] this->m_Data;    //delete the OLD buffer
                                   //(if it existed)
      }

      this->m_Data=new_dest;         //and use the new one
      this->m_Max_Chars=num_chars;

      return true;
   }

   //specify the current position
   void SetPos(const unsigned int val)
   {
      this->m_Pos=val;
      //ensure pos stays within the buffer
      if(this->m_Max_Chars==0)
      {
         //no buffer was allocated
         this->m_Pos=0;
      }
      else if(val>=this->m_Max_Chars)
      {
         this->m_Pos=this->m_Max_Chars-1;
      }
   }

   const unsigned int GetPos(void)
   {
      return this->m_Pos;
   }

   //make space and add a character
   //returns false if failed to make space
   bool Add(const char c)
   {
      if(CheckSpace(this->m_Pos+1)==false)
      {
         return false;
      }

      this->m_Data[this->m_Pos]=c;
      this->m_Pos++;

      return true;
   }

   //make space and add a string
   bool Add(const char* src,const unsigned int num_chars)
   {
      if(CheckSpace(this->m_Pos+num_chars)==false)
      {
         return false;
      }

      memcpy(this->m_Data+this->m_Pos,src,num_chars);
      this->m_Pos+=num_chars;

      return true;
   }

   //read pointer to the buffer - only valid while the
   //instance is in scope
   char* Read(void)
   {
      return this->m_Data;
   }
};

//class to process a fixed width text file into a CSV
class CTextToCSV
{
public:

   //error codes
   enum ErrCode
   {
      None = 0,
      OutOfMem,
      FileNotFound,
      FileOpenForWriteFailed
   };

private:
//members
   FILE* m_Dest;            //destination (CSV) file
   FILE* m_Src;            //source text file

   CCharBuffer m_Src_Buffer;
   CCharBuffer m_Dest_Buffer;
   CWidthList m_Width;

   unsigned int m_Start_Col;   //start column (optional)

protected:
   //prevent new from being used - force any
   //instance to be on the stack
   void * operator new (size_t);
   void * operator new[] (size_t);

//private methods
private:

   //read field into m_Dest_Buffer
   //returns OutOfMem if failed to resize dest buffer
   //
   ErrCode ReadField(const int curpos,const int width)
   {
   //first, scan src to get the trimmed extents
   //and to discover if comma is present

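      int start=curpos;
      int end=curpos+width-1;

      //read
      const char* src_buf=this->m_Src_Buffer.Read();

      //skip leading spaces
      while(start<=end)
      {
         if(src_buf[start]!=0x20)
         {
            //non space found
            break;
         }
         start++;
      }

      //skip trailing spaces
      while(end>start)
      {
         if(src_buf[end]!=0x20)
         {
            //non space found
            break;
         }
         end--;
      }
      //start and end are inclusive; an all-space field
      //leaves start beyond end, giving zero bytes to copy

      //quotes are needed if the field holds a comma or a quote
      bool enclose_in_quotes = false;
      int i=start;
      while(i<=end)
      {
         if(src_buf[i]==',' || src_buf[i]=='"')
         {
            enclose_in_quotes = true;
            break;
         }
         i++;
      }

      if(enclose_in_quotes==false)
      {
         //copy the trimmed field unchanged
         int bytes_to_copy=end-start+1;
         if(this->m_Dest_Buffer.Add(
            src_buf+start,bytes_to_copy)==false)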
         {
            //insufficient space in the destination buffer
            return OutOfMem;
         }
      }
      else
      {
         //enclose in quotes and escape any double quote character

         //add opening quotes
         if(this->m_Dest_Buffer.Add('"')==false)
         {
            //insufficient space in the destination buffer
            return OutOfMem;
         }

         //copy all characters and escape any double quote
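         while(start<=end)
         {
            if(src_buf[start]=='"')
            {
               //escape the quote by doubling it
               if(this->m_Dest_Buffer.Add('"')==false ||
                  this->m_Dest_Buffer.Add('"')==false)
               {
                  //out of memory
                  return OutOfMem;
               }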
            }
            else
            {
               //simply add the character
               if(this->m_Dest_Buffer.Add(src_buf[start])==false)
               {
                  //out of memory
                  return OutOfMem;
               }
            }
            start++;
         }

         //add closing quotes
         if(this->m_Dest_Buffer.Add('"')==false)
         {
            return OutOfMem;
         }
      }
      return None;
   }

   //process each row
   //returns ErrCode (normally None)
   ErrCode ProcessRow(void)
   {

      this->m_Dest_Buffer.SetPos(0);

      //if a CR or LF is found, terminate the src buffer before it
      //(fgets keeps the line ending, and files with LF-only
      //line endings contain no CR)
      char* src_buf=this->m_Src_Buffer.Read();

      char* sp=strpbrk(src_buf,"\r\n");
      if(sp!=NULL)
      {
         sp[0]=0;
      }

      //pad the src buffer with spaces
      int len=strlen(src_buf);
      int pad_len=this->m_Width.GetTotalWidth()-len;
      if(pad_len>0)
      {
         //pad with spaces
         sp=src_buf + len;
         memset(sp,0x20,pad_len);
      }

      //read each field
      unsigned int x=0;
      int curpos=0;

      if(this->m_Start_Col>0 &&
      this->m_Start_Col<this->m_Width.GetNumCols())
      {
         //specified start column is valid
         while(x < this->m_Start_Col)
         {
            curpos += this->m_Width.GetWidth(x);
            x++;
         }
      }

      while(x < this->m_Width.GetNumCols())
      {
         if(ReadField(curpos,
            this->m_Width.GetWidth(x))==OutOfMem)
         {
            //failed to read a field due to memory failure
            return OutOfMem;
         }

         x++;

         //add a comma UNLESS this is the last field
         if(x<this->m_Width.GetNumCols())
         {
            if(this->m_Dest_Buffer.Add(',')==false)
            {
            //insufficient space in the destination buffer
               return OutOfMem;
            }
            //add the width of previous column
            curpos += this->m_Width.GetWidth(x-1);

         }

      }

      fwrite(this->m_Dest_Buffer.Read(),
             1,
             this->m_Dest_Buffer.GetPos(),
             this->m_Dest);

      fwrite("\r\n",1,2,this->m_Dest);

      return None;

   }

   //close files
   void Close(void)
   {
      //close src file
      if(this->m_Src!=NULL)
      {
         fclose(this->m_Src);
         this->m_Src=NULL;
      }

      //close dest file
      if(this->m_Dest!=NULL)
      {
         fclose(this->m_Dest);
         this->m_Dest=NULL;
      }

   }

protected:
   //process progress report as each row is read
   virtual void Progress(unsigned int row_num)
   {
   }

public:
   //constructor
   CTextToCSV()
   {
      this->m_Dest = NULL;
      this->m_Src = NULL;
   }

   //destructor
   ~CTextToCSV()
   {
      //close open files and free allocated buffers
      Close();
   }

   ErrCode Process(const char* p_dest_file,
                   const char* p_src_file,
                   const char* p_widths,
                   const char* p_start_col)
   {
      this->m_Start_Col=0;

      //parse the widths string
      if(this->m_Width.ParseWidths(p_widths)==false)
      {
         return OutOfMem;
      }

      //ensure the src buffer has
      //sufficient space to read total width
      if(this->m_Src_Buffer.CheckSpace(
		  this->m_Width.GetTotalWidth()*2)==false)
		  //ensure src buffer min size
      {
         return OutOfMem;
      }

      //open source file
      this->m_Src=fopen(p_src_file,"rb");
      if(this->m_Src==NULL)
      {
         //failed to open src file

         return FileNotFound;
      }

      //open destination file
      this->m_Dest=fopen(p_dest_file,"wb");
      if(this->m_Dest==NULL)
      {
         //failed to open dest file
         return FileOpenForWriteFailed;
      }

      //read start column number if set
      if(p_start_col!=NULL)
      {
         this->m_Start_Col=atoi(p_start_col);
      }

      unsigned int row=0;      //row counter
      while(1==1)
      {
         void* result=fgets(this->m_Src_Buffer.Read(),
                            this->m_Width.GetTotalWidth()*2,
                            this->m_Src);
         if(result==NULL)
         {
            break;
         }
         row++;

         ErrCode err = ProcessRow();
         if(err!=None)
         {
            return err;
         }

         Progress(row);

      }

      //and close files and buffers
      Close();

      return None;
   }

};

//class derived from CTextToCSV to allow
//bespoke progress handling
class MyProcess : public CTextToCSV
{
public:

protected:
   //process progress report as each row is read
   void Progress(unsigned int row_num)
   {
      if((row_num%10000)==0)
      {
         printf("Row %u\r\n",row_num);
      }
   }

};

int main(int argc, char* argv[])
{
   printf("TextToCSV Version 1.0.0.1 (c) 2014\r\n\r\n");

   //read params
   int num_param=argc;
   if(num_param<2)
   {
      printf("parameters:-\r\n\r\n");
      printf("dest filename (e.g. mydata.csv)\r\n");
      printf("source filename (e.g. mydata.txt)\r\n");
      printf("column widths (e.g. 10,10,20,30,50)\r\n");
      printf("start column position (optional, 0 based)\r\n");

      return(0);
   }

   const char* src=NULL;
   const char* dest=NULL;
   const char* widths=NULL;
   const char* start_col=NULL;

   int i=1;
   while(i<num_param)
   {
      const char* pr=(const char*)argv[i];
      //assign each parameter
      if(pr)
      {
         if(dest==NULL)
         {
            dest=pr;
         }
         else if(src==NULL)
         {
            src = pr;
         }
         else if(widths==NULL)
         {
            widths = pr;
         }
         else if(start_col==NULL)
         {
            start_col = pr;
         }
      }
      i++;
   }

   if(src == NULL)
   {
      printf("Missing source filename\r\n");
      return -1;
   }
   if(dest == NULL)
   {
      printf("Missing dest filename\r\n");
      return -1;
   }
   if(widths == NULL)
   {
      printf("Missing column widths\r\n");
      return -1;
   }

   printf("Processing file %s into file %s\r\n\r\n",src,dest);

   //an instance of our class, derived from CTextToCSV
   //note that this instance is created on the stack which is simpler and
   //safer than using new and delete.
   //
   MyProcess curpos;

   //and process the file
   MyProcess::ErrCode err = curpos.Process(dest,src,widths,start_col);

   //read the error code and print a report to console
   if(err!=MyProcess::None)
   {
      switch(err)
      {
         case MyProcess::OutOfMem:
            printf("Error: out of memory\r\n\r\n");
            break;
         case MyProcess::FileNotFound:
            printf("Error: source file not found\r\n\r\n");
            break;
         case MyProcess::FileOpenForWriteFailed:
            printf("Error: unable to open destination file\r\n\r\n");
            break;

         default:
            break;

      }
   }
   else
   {
      printf("process completed, no errors\r\n");

   }

   return 0;
}

Normalising Nationalities (via a good ISO-3166 Country List)

A recent development has seen the acquisition of some very large data sets which contain a “nationality” column. Unfortunately the contents of that column are inconsistent; sometimes the country name is used, sometimes the nationality (or demonym) is mis-spelled. We decided therefore to normalise the nationality columns, and to do so by converting them to the two character ISO-3166 country code list; common codes are GB, US, AU, CA, CN, ES etc.

Unfortunately many of the lists available are incomplete, out of date or have errors. We have therefore compiled a new ISO-3166 list which includes, where possible, up to three demonyms for each row. These include some common mis-spellings as well as the genuine alternatives. The data referred to above includes the unpleasant sounding nationality “Turk” as well as “Turkish”, so both demonyms are included in the list for the sake of completeness.

The latest changes to the world’s countries are represented here, including creation of South Sudan, and the separation of Serbia from Montenegro. Burma was renamed by its government 25 years ago, but the old name is still commonly used by news agencies and other bodies, so our list gives the official name in brackets after the common name. We’d also like to say a big hello to the people of The Republic of the Union of Myanmar today.

The table includes some rows which are not sovereign countries, such as Guam or Jersey. The table is sorted by ISO-3166 code.

We have successfully used this table to normalise the nationality data in the aforementioned large data sets. The new column is a simple two-character Latin code, which represents a worthwhile space saving in our large database, in addition to the new consistency of the data.

One problem with this approach is the occasional duplication of demonyms, which would prevent the remapping of nationalities back from the ISO-3166 country code; look at The Virgin Islands (both countries) whose inhabitants are both described as “Virgin Islanders”. This description fails to clarify to which Virgin Islands the person belongs, so it is insufficient to accurately determine the ISO-3166 code anyway. We have special-cased our normalisation in this instance.
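
The normalisation step can be sketched as a lookup from any known name or demonym to its code, with shared demonyms flagged rather than guessed. This is an assumed design, not a description of our own tooling: the class NationalityNormaliser and its methods are hypothetical, and the case folding only handles ASCII letters. Besides “Virgin Islander”, the table also repeats “Congolese”, “Dominican”, “Guinean”, “Samoan” and “Sudanese” across two codes, so those would be flagged as well.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical lookup: country name or demonym -> ISO-3166 code.
class NationalityNormaliser
{
    std::map<std::string, std::string> m_Codes;

    // Trims spaces and lower-cases ASCII letters to form the map key.
    static std::string Key(const std::string& text)
    {
        const std::string::size_type first = text.find_first_not_of(' ');
        if (first == std::string::npos)
        {
            return "";
        }
        const std::string::size_type last = text.find_last_not_of(' ');
        std::string key = text.substr(first, last - first + 1);
        for (std::string::size_type i = 0; i < key.size(); ++i)
        {
            key[i] = static_cast<char>(
                std::tolower(static_cast<unsigned char>(key[i])));
        }
        return key;
    }

public:
    // Registers one name or demonym for a code.
    void Add(const std::string& code, const std::string& name)
    {
        const std::string key = Key(name);
        if (key.empty())
        {
            return;  // the row has no entry in this column
        }
        std::map<std::string, std::string>::iterator it = m_Codes.find(key);
        if (it == m_Codes.end())
        {
            m_Codes[key] = code;
        }
        else if (it->second != code)
        {
            it->second = "";  // shared by two codes: mark as ambiguous
        }
    }

    // Returns the code, or an empty string if unknown or ambiguous.
    std::string Lookup(const std::string& text) const
    {
        std::map<std::string, std::string>::const_iterator it =
            m_Codes.find(Key(text));
        return it == m_Codes.end() ? std::string() : it->second;
    }
};
```

Values that come back empty still need a special case or a manual review, as with the Virgin Islands.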

The ISO-3166 CSV is available to download here, and the table is shown below:-

Code | Name | Demonym 1 | Demonym 2 | Demonym 3
AD | Andorra | Andorran
AE | United Arab Emirates | Emirian | Emirati
AF | Afghanistan | Afghani | Afghan
AG | Antigua and Barbuda | Antiguan
AI | Anguilla | Anguillan
AL | Albania | Albanian | Alabanian
AM | Armenia | Armenian | Hayastani
AO | Angola | Angolan
AQ | Antarctica | Antarctic
AR | Argentina | Argentine | Argentinian | Argentinean
AS | American Samoa | Samoan
AT | Austria | Austrian
AU | Australia | Australian
AW | Aruba | Arubian
AX | Åland Islands | Ålandic | Ålandish
AZ | Azerbaijan | Azerbaijani
BA | Bosnia and Herzegovina | Bosnian | Herzegovinian
BB | Barbados | Barbadian | Barbadan | Bajan
BD | Bangladesh | Bangladeshi
BE | Belgium | Belgian
BF | Burkina Faso | Burkinabe
BG | Bulgaria | Bulgarian
BH | Bahrain | Bahrainian
BI | Burundi | Burundian
BJ | Benin | Beninese
BL | Saint Barthélemy | Barthélemois
BM | Bermuda | Bermudan
BN | Brunei | Bruneian
BO | Bolivia | Bolivian
BQ | Caribbean Netherlands
BR | Brazil | Brazilian
BS | Bahamas | Bahameese | Bahamian
BT | Bhutan | Bhutanese
BV | Bouvet Island
BW | Botswana | Motswana | Batswana
BY | Belarus | Belarusian
BZ | Belize | Belizean
CA | Canada | Canadian
CC | Cocos (Keeling) Islands | Cocossian | Cocos Islandia
CD | Democratic Republic of the Congo | Congolese
CF | Central African Republic | Central African
CG | Congo (Republic of) | Congolese
CH | Switzerland | Swiss
CI | Côte d’Ivoire (Ivory Coast) | Ivorian
CK | Cook Islands | Cook Islander
CL | Chile | Chilean
CM | Cameroon | Cameroonian
CN | China | Chinese
CO | Colombia | Colombian | Columbian
CR | Costa Rica | Costa Rican
CU | Cuba | Cuban
CV | Cape Verde | Cape Verdean
CW | Curaçao | Curaçaoan
CX | Christmas Island | Christmas Islander
CY | Cyprus | Cypriot
CZ | Czech Republic | Czech
DE | Germany | German
DJ | Djibouti | Djiboutian | Djibouti
DK | Denmark | Danish | Dane
DM | Dominica | Dominican
DO | Dominican Republic | Dominican
DZ | Algeria | Algerian
EC | Ecuador | Ecuadorean | Ecudorean
EE | Estonia | Estonian
EG | Egypt | Egyptian
EH | Western Sahara | Western Saharan | Sahrawi
ER | Eritrea | Eritrean
ES | Spain | Spanish
ET | Ethiopia | Ethiopian
FI | Finland | Finnish
FJ | Fiji | Fijian
FK | Falkland Islands | Falkland Islander
FM | Micronesia | Micronesian
FO | Faroe Islands | Faroese
FR | France | French
GA | Gabon | Gabonese
GB | United Kingdom | British
GD | Grenada | Grenadian
GE | Georgia | Georgian
GF | French Guiana | French Guianese
GG | Guernsey
GH | Ghana | Ghanaian | Ghanian
GI | Gibraltar | Gibraltarian
GL | Greenland | Greenlander | Greenlandic
GM | Gambia | Gambian
GN | Guinea | Guinean
GP | Guadeloupe | Guadeloupean
GQ | Equatorial Guinea | Equatorial Guinean | Equatoguinean
GR | Greece | Greek
GS | South Georgia and the South Sandwich Islands
GT | Guatemala | Guatemalan
GU | Guam | Guamanian
GW | Guinea-Bissau | Guinean
GY | Guyana | Guyanese
HK | Hong Kong | Hong Konger
HM | Heard and McDonald Islands
HN | Honduras | Honduran
HR | Croatia | Croatian | Croat
HT | Haiti | Haitian
HU | Hungary | Hungarian
ID | Indonesia | Indonesian
IE | Ireland | Irish
IL | Israel | Israeli
IM | Isle of Man | Manx
IN | India | Indian
IO | British Indian Ocean Territory
IQ | Iraq | Iraqi
IR | Iran | Iranian
IS | Iceland | Icelander
IT | Italy | Italian
JE | Jersey
JM | Jamaica | Jamaican
JO | Jordan | Jordanian
JP | Japan | Japanese
KE | Kenya | Kenyan
KG | Kyrgyzstan | Kyrgyzstani
KH | Cambodia | Cambodian
KI | Kiribati | I-Kiribati
KM | Comoros | Comoran
KN | Saint Kitts and Nevis | Kittian | Nevisian
KP | North Korea | North Korean
KR | South Korea | South Korean
KW | Kuwait | Kuwaiti
KY | Cayman Islands | Caymanian
KZ | Kazakhstan | Kazakhstani | Kazakh
LA | Laos | Laotian
LB | Lebanon | Lebanese
LC | Saint Lucia | Saint Lucian
LI | Liechtenstein | Liechtensteiner
LK | Sri Lanka | Sri Lankan
LR | Liberia | Liberian
LS | Lesotho | Mosotho | Basotho
LT | Lithuania | Lithuanian
LU | Luxembourg | Luxembourger
LV | Latvia | Latvian
LY | Libya | Libyan
MA | Morocco | Moroccan
MC | Monaco | Monacan
MD | Moldova | Moldovan
ME | Montenegro | Montenegrin
MF | Saint Martin (France)
MG | Madagascar | Malagasy
MH | Marshall Islands | Marshallese
MK | Macedonia | Macedonian
ML | Mali | Malian
MM | Burma (Republic of the Union of Myanmar) | Myanmarese | Burmese
MN | Mongolia | Mongolian
MO | Macau | Macanese
MP | Northern Mariana Islands | Northern Mariana Islander
MQ | Martinique | Martinican | Martiniquaís
MR | Mauritania | Mauritanian
MS | Montserrat | Montserratian
MT | Malta | Maltese
MU | Mauritius | Mauritian
MV | Maldives | Maldivan
MW | Malawi | Malawian
MX | Mexico | Mexican
MY | Malaysia | Malaysian
MZ | Mozambique | Mozambican
NA | Namibia | Namibian
NC | New Caledonia | New Caledonian | New Caledonians
NE | Niger | Nigerien
NF | Norfolk Island | Norfolk Islander
NG | Nigeria | Nigerian
NI | Nicaragua | Nicaraguan | Nicoya
NL | Netherlands | Dutch
NO | Norway | Norwegian
NP | Nepal | Nepalese
NR | Nauru | Nauruan
NU | Niue | Niuean
NZ | New Zealand | New Zealander
OM | Oman | Omani
PA | Panama | Panamanian
PE | Peru | Peruvian
PF | French Polynesia | French Polynesian
PG | Papua New Guinea | Papua New Guinean
PH | Philippines | Filipino
PK | Pakistan | Pakistani
PL | Poland | Polish | Pole
PM | St. Pierre and Miquelon | Saint-Pierrais | Miquelonnais
PN | Pitcairn | Pitcairn Islander
PR | Puerto Rico | Puerto Rican
PS | Palestine | Palestinian
PT | Portugal | Portuguese | Portugese
PW | Palau | Palauan
PY | Paraguay | Paraguayan
QA | Qatar | Qatari
RE | Réunion
RO | Romania | Romanian
RS | Serbia | Serbian | Serb
RU | Russian Federation | Russian
RW | Rwanda | Rwandan | Rwandese
SA | Saudi Arabia | Saudi Arabian | Saudi
SB | Solomon Islands | Solomon Islander
SC | Seychelles | Seychellois
SD | Sudan | Sudanese
SE | Sweden | Swedish | Swede
SG | Singapore | Singaporean
SH | Saint Helena | Saint Helenian
SI | Slovenia | Slovenian | Slovene
SJ | Svalbard and Jan Mayen Islands
SK | Slovakia | Slovakian | Slovak
SL | Sierra Leone | Sierra Leonean
SM | San Marino | Sanmarinese | Sammarinese
SN | Senegal | Senegalese
SO | Somalia | Somali
SR | Suriname | Surinamer | Surinamese
SS | South Sudan | Sudanese
ST | São Tome and Príncipe | São Tomean | Sao Tomean
SV | El Salvador | Salvadorean | Salvadoran
SX | Saint Martin (Netherlands)
SY | Syria | Syrian
SZ | Swaziland | Swazi
TC | Turks and Caicos Islands | Turks and Caicos Islander
TD | Chad | Chadian
TF | French Southern Territories
TG | Togo | Togolese
TH | Thailand | Thai
TJ | Tajikistan | Tajikistani
TK | Tokelau | Tokelauan
TL | Timor-Leste | Timorese
TM | Turkmenistan | Turkmen
TN | Tunisia | Tunisian
TO | Tonga | Tongan
TR | Turkey | Turkish | Turk
TT | Trinidad and Tobago | Trinidadian | Tobagonian
TV | Tuvalu | Tuvaluan
TW | Taiwan | Taiwanese
TZ | Tanzania | Tanzanian
UA | Ukraine | Ukrainian
UG | Uganda | Ugandan
UM | United States Minor Outlying Islands
US | United States of America | American
UY | Uruguay | Uruguayan
UZ | Uzbekistan | Uzbekistani
VA | Vatican
VC | Saint Vincent and Grenadines | Saint Vincentian | Vincentian
VE | Venezuela | Venezuelan
VG | British Virgin Islands | Virgin Islander
VI | United States Virgin Islands | Virgin Islander
VN | Vietnam | Vietnamese
VU | Vanuatu | Ni-Vanuatu
WF | Wallis and Futuna Islands | Wallisian | Futunan
WS | Samoa | Samoan
YE | Yemen | Yemeni | Yemenese
YT | Mayotte | Mahoran
ZA | South Africa | South African
ZM | Zambia | Zambian
ZW | Zimbabwe | Zimbabwean