
Excel2CSV: Convert MS Excel Worksheets to CSV
Posted by: Informatica Professional Services
Java utility that generates csv files by continuous scanning an input directory for MS Excel files. It is supported on Windows, Linux, AIX, Solaris Operating systems.
Overview

ExcelSheet2Csv polls a directory and reads all excel files as and when available. It converts each sheet in excel file to delimited csv files. It is supported on Windows, Linux, AIX, Solaris Operating system. It supports all excel file versions. Processing steps:
- Polls a given location for any excel files. Polling can be configured based on user requirements and this utility will run as a service until terminated explicitly. The default polling duration is 120 seconds (2 minutes).
- Whenever a new excel is dropped in the folder, the utility will create a folder and name it according to excel file name and extension. Then it will generate a csv file for each sheet in the excel file.
- Utility makes sure that it doesn?t extract same file twice.
- Use latest version of the utility (attached). Please remove .txt from its extension.
- Copy attached date-format properties file (dataFormat.properties) to /some/location/on/client/machine/dataFormat.properties of client machine
- Pass -dtf /some/location/on/client/machine/dataFormat.properties as additional parameter to regular command.
- For e.g.java -jar excel2csv_2.1.jar -delim "|" -efd C:\Users\abcd\work\MyWork\Excel2CSV -force -tq D -wt 0 -dtf C:\Users\abcd\workspaces\ExcelToCsvConverter\ExcelToCsvConverter\utils\dateFormat.properties
- -delim <delimiter>: Delimiter for the output file. User is free to specify any delimiter enclosed by double quotes. E.g. ?,?, ?|?
- -efd <excel_dir_full_path>: Full path of excel file directory. Utility will poll for new excel files in this location. Also, files are extracted in this location.
- ?force: Utility makes sure that same excel file is not extracted again. However, in some scenarios it becomes necessary to re-extract all files. To override default behavior, user can use this option.
- ?infinite: Utility is capable of polling for an infinite duration. User can implement infinite polling using this option.
- -wt <time_in_seconds>: It is an alternate to infinite polling. User can provide time, in seconds. Utility will poll for new excel files based on this number. By default, utility polls for 120 seconds.
- Extraction of selected sheets.
- For this, users might want to use -idx option.
- End users can pass sheet indices like this -> -idx 0 2 3 4 OR -idx 0-4 5 6.
- In first case, 1st, 3rd, 4th, and 5th sheet will be extracted. In second case, first 5 sheet (0 to 4), 6th and 7th sheets will be extracted
- Extracted files to be placed in output directory of user's choice instead of default directory.
- For this, users might want to use -ofd option.
- E.g. -ofd /some/location/on/disk.
- Note: workbooks will still be extracted under their folders (as earlier) i.e /some/location/on/disk/xlsx_workbook/sheet1.csv
- File names for generated files.
- -sfn can be used to achieve this.
- Syntax : -sfn key1=value1 key2=value2 ... keyN=valueN. Where key = sheet-index (0-based index) and value = file name for that sheet.
Features
Support for xls (1997-2003) and xlsx (newer version) excel files as input.Support for user defined output delimiter. E.g. ?,?, ?|? etc.Support for infinite polling. With infinite polling in place, user need not intervene in extraction process and hence making it more automatable. Support for timed waiting. This is an alternative to infinite polling.Utility make sure that same file is not extracted twice and hence saving valuable time in case the to-be-extracted excel file is huge. However, this behavior can be overridden by user. If you want to re-extract same file again, use ?force option to do the same.- OS Support: All OS
- Excel format support: xls (97-2003) and xlsx formats.
- Minimum Java Version: 1.6.0.45 onwards.