Last week I had to put more than 270 EXCEL spreadsheets into SQL data records. All the XLS files had the same number of rows, corresponding to a fixed format, but the number of the columns could be different -random within a range 2-11. The first step was converting XLS to comma-separated values files (CSV), using a command line tool xls2csv. For the second step, reading CSV files and writing data to several SQL data records, Neobook did a very good job. For your info, I'll describe both steps.
Step 1. xls2csv
The program xls2csv converts EXCEL spreadsheet into a CSV file. It extracts only data without any formatting info and formulas. On Linux you have to install the program catdoc: xls2csv is part of it. A Windows binary can be found on http://blog.brush.co.nz/2009/09/catdoc-windows/
To convert only one file, use the following command after the prompt:
Code: Select all
xls2csv -x "reinier.xls" > "reinier_csv.csv"
Code: Select all
xls2csv -x "reinier.xls" -s cp1252 -d 8859-1 > "reinier_csv.csv"
Code: Select all
xls2csv -x -fdd/mm/yyyy "reinier.xls" -s cp1252 -d 8859-1 > "reinier_csv.csv"
Code: Select all
#!/bin/bash
for f in *.xls; do
xls2csv -x "$f" -s cp1252 -d 8859-1 > "${f%.xls}.csv"
done
Code: Select all
FORFILES /M *xls /C "cmd /c xls2csv -x @file -s cp1252 -d 8859-1 > @fname.csv"
Step 2. Neobook
In the CSV file, each line represents a specific row and columns of the original XLS file. In other words, each CSV line contains the cell data of the XLS spreadsheet, separated by "," (default). In my case, each line is read and parsed into an array. Each array element will be written to a specific file, that will contain at the end -after adding extra data- a number of SQL records.
Neobook is for this operation an excellent tool! And only a small number of commands are needed:
Code: Select all
.list all csv filenames in the current directory into the variable [mcFiles]
FileList "[PubDir]*.csv" "Files" "[mcFiles]"
.separate [mcFiles] into multiple parts using [#13] as delimiter.
StrParse "[mcFiles]" "[#13]" "[file]" "[noFiles]"
.transverse through all files
Loop "1" "[noFiles]" "[z]"
.current file
SetVar "[mcFile]" "[PubDir][file[z]]"
.determine how many lines the file contains
FileLen "[mcFile]" "[mcLen]"
Loop "1" "[mcLen]" "[x]"
.read each line
FileRead "[mcFile]" "[x]" "[line]"
.do some processing on each line and use FileWrite to write the results to the destination files.
EndLoop
EndLoop
Best regards,
Reinier