Converting XLS to SQL via CSV

reinier maliepaard · Post by **reinier maliepaard** » Thu Jun 30, 2016 10:04 pm

Hello,

Last week I had to put more than 270 EXCEL spreadsheets into SQL data records. All the XLS files had the same number of rows, corresponding to a fixed format, but the number of the columns could be different -random within a range 2-11. The first step was converting XLS to comma-separated values files (CSV), using a command line tool xls2csv. For the second step, reading CSV files and writing data to several SQL data records, Neobook did a very good job. For your info, I'll describe both steps.

Step 1. xls2csv
The program xls2csv converts EXCEL spreadsheet into a CSV file. It extracts only data without any formatting info and formulas. On Linux you have to install the program catdoc: xls2csv is part of it. A Windows binary can be found on http://blog.brush.co.nz/2009/09/catdoc-windows/

To convert only one file, use the following command after the prompt:

Code: Select all

xls2csv -x "reinier.xls" > "reinier_csv.csv"

To have more control use the parameters -s (source) and -d (destination). These are used to specify the character encoding for the source and for the destination file. Here cp1252 refers to Microsoft character encoding and 8859-1 to Western European character encoding.

Code: Select all

xls2csv -x "reinier.xls" -s cp1252 -d 8859-1 > "reinier_csv.csv"

If your file contains dates, use the parameter -f including a date format string.

Code: Select all

xls2csv -x -fdd/mm/yyyy "reinier.xls" -s cp1252 -d 8859-1 > "reinier_csv.csv"

To convert more than one file, use in Linux the following bash script:

Code: Select all

#!/bin/bash
for f in *.xls; do
    xls2csv -x "$f" -s cp1252 -d 8859-1 > "${f%.xls}.csv"
done

In Windows you could use the following command line

Code: Select all

FORFILES /M *xls /C "cmd /c xls2csv -x @file -s cp1252 -d 8859-1 > @fname.csv"

For information on the switches of xls2cvs: http://linux.die.net/man/1/xls2csv

Step 2. Neobook
In the CSV file, each line represents a specific row and columns of the original XLS file. In other words, each CSV line contains the cell data of the XLS spreadsheet, separated by "," (default). In my case, each line is read and parsed into an array. Each array element will be written to a specific file, that will contain at the end -after adding extra data- a number of SQL records.

Neobook is for this operation an excellent tool! And only a small number of commands are needed:

Code: Select all

.list all csv filenames in the current directory into the variable [mcFiles]
FileList "[PubDir]*.csv" "Files" "[mcFiles]"

.separate [mcFiles] into multiple parts using [#13] as delimiter.
StrParse "[mcFiles]" "[#13]" "[file]" "[noFiles]"

.transverse through all files
Loop "1" "[noFiles]" "[z]"
 
 .current file
 SetVar "[mcFile]" "[PubDir][file[z]]"
 
 .determine how many lines the file contains
 FileLen "[mcFile]" "[mcLen]"
 
 Loop "1" "[mcLen]" "[x]"
  .read each line
  FileRead "[mcFile]" "[x]" "[line]"
  .do some processing on each line and use FileWrite to write the results to the destination files.
 EndLoop

EndLoop

Happy programming!

Best regards,
Reinier

Post by **Neosoft Support** » Sat Jul 02, 2016 5:25 pm

Interesting Article! Thank you for sharing this with us.

Archived NeoSoft Support Forum

Converting XLS to SQL via CSV

Converting XLS to SQL via CSV

Re: Converting XLS to SQL via CSV