Introduction to Apache Commons CSV
1. Overview
The Apache Commons CSV library has many useful features for creating and reading CSV files.
In this quick tutorial, we’ll see how to use this library through a simple example.
2. Maven Dependency
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-csv</artifactId>
    <version>1.4</version>
</dependency>
The most recent version of this library can be found on Maven Central.
3. Reading a CSV File
Consider the following CSV file, book.csv, which lists authors and the titles of their books:
author,title
Dan Simmons,Hyperion
Douglas Adams,The Hitchhiker's Guide to the Galaxy
Let’s see how we can read it:
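import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.HashMap;
import java.util.Map;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVRecord;

// Expected contents of book.csv, keyed by author
Map<String, String> AUTHOR_BOOK_MAP = new HashMap<String, String>() {
    {
        put("Dan Simmons", "Hyperion");
        put("Douglas Adams", "The Hitchhiker's Guide to the Galaxy");
    }
};
String[] HEADERS = { "author", "title" };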
@Test
public void givenCSVFile_whenRead_thenContentsAsExpected() throws IOException {
    Reader in = new FileReader("book.csv");
    // withFirstRecordAsHeader() takes the column names from the first line
    // and skips that line, so a preceding withHeader(...) call would be overridden
    Iterable<CSVRecord> records = CSVFormat.DEFAULT
      .withFirstRecordAsHeader()
      .parse(in);
    for (CSVRecord record : records) {
        String author = record.get("author");
        String title = record.get("title");
        assertEquals(AUTHOR_BOOK_MAP.get(author), title);
    }
}
We are reading the records of a CSV file after skipping the first line as it is the header.
There are different types of CSVFormat that specify the format of the CSV file; one of them is used in the next section.
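To illustrate how a format can be adjusted, here is a minimal sketch that parses semicolon-delimited text by customizing CSVFormat.DEFAULT with withDelimiter(';'). The class name FormatVariantExample, the helper titles() and the inline sample data are assumptions made for this example only; the library also ships predefined constants such as CSVFormat.EXCEL and CSVFormat.TDF.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

class FormatVariantExample {

    // Hypothetical helper: returns the "title" column of semicolon-delimited text
    static List<String> titles(String csv) {
        CSVFormat format = CSVFormat.DEFAULT
          .withDelimiter(';')
          .withFirstRecordAsHeader();
        List<String> titles = new ArrayList<>();
        try (CSVParser parser = format.parse(new StringReader(csv))) {
            for (CSVRecord record : parser) {
                titles.add(record.get("title"));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return titles;
    }

    public static void main(String[] args) {
        String csv = "author;title\n"
          + "Dan Simmons;Hyperion\n"
          + "Douglas Adams;The Hitchhiker's Guide to the Galaxy\n";
        System.out.println(titles(csv));
    }
}
```

Reading from a StringReader keeps the sketch independent of any file on disk; a FileReader can be passed to parse() in exactly the same way.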
4. Creating a CSV File
Let’s see how we can write the same records to a new CSV file:
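public void createCSVFile() throws IOException {
    FileWriter out = new FileWriter("book_new.csv");
    try (CSVPrinter printer = new CSVPrinter(out, CSVFormat.DEFAULT
      .withHeader(HEADERS))) {
        // A plain loop is used because printRecord() throws the checked
        // IOException, which a forEach lambda cannot propagate
        for (Map.Entry<String, String> entry : AUTHOR_BOOK_MAP.entrySet()) {
            printer.printRecord(entry.getKey(), entry.getValue());
        }
    }
}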
The new CSV file will be created with the appropriate headers because we have specified them in our CSVFormat declaration.
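One way to check that the header really ends up in the output is to write to memory and parse the result back. The following is a minimal sketch under that assumption; the class RoundTripExample and its write() and read() helpers are hypothetical names introduced for illustration, and a StringWriter stands in for the FileWriter.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVPrinter;
import org.apache.commons.csv.CSVRecord;

class RoundTripExample {

    // Writes the map as CSV text with an "author,title" header line
    static String write(Map<String, String> books) {
        StringWriter out = new StringWriter();
        try (CSVPrinter printer = new CSVPrinter(out, CSVFormat.DEFAULT
          .withHeader("author", "title"))) {
            for (Map.Entry<String, String> entry : books.entrySet()) {
                printer.printRecord(entry.getKey(), entry.getValue());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return out.toString();
    }

    // Parses the CSV text back, using the first line as the header
    static Map<String, String> read(String csv) {
        Map<String, String> books = new LinkedHashMap<>();
        try (CSVParser parser = CSVFormat.DEFAULT
          .withFirstRecordAsHeader()
          .parse(new StringReader(csv))) {
            for (CSVRecord record : parser) {
                books.put(record.get("author"), record.get("title"));
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return books;
    }

    public static void main(String[] args) {
        Map<String, String> books = new LinkedHashMap<>();
        books.put("Dan Simmons", "Hyperion");
        books.put("Douglas Adams", "The Hitchhiker's Guide to the Galaxy");
        System.out.println(read(write(books)).equals(books));
    }
}
```

If the header were missing from the written text, read() would treat the first book as the header line and the lookups by "author" and "title" would fail.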
5. Headers & Reading Columns
There are different ways to read and write headers. Similarly, there are different ways to read column values.
Let’s go through them one by one:
5.1. Accessing Columns by Index
This is the most basic way to read column values, and it can be used when the headers of the CSV file are not known:
Reader in = new FileReader("book.csv");
Iterable<CSVRecord> records = CSVFormat.DEFAULT.parse(in);
for (CSVRecord record : records) {
    String columnOne = record.get(0);
    String columnTwo = record.get(1);
}
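One detail worth keeping in mind: since plain CSVFormat.DEFAULT has no header configured, a header line in the input comes back as an ordinary record. The sketch below demonstrates this with in-memory data; IndexAccessExample and firstColumn() are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

class IndexAccessExample {

    // Collects the value at index 0 of every record, header line included
    static List<String> firstColumn(String csv) {
        List<String> values = new ArrayList<>();
        try (CSVParser parser = CSVFormat.DEFAULT.parse(new StringReader(csv))) {
            for (CSVRecord record : parser) {
                values.add(record.get(0));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        // The first element of the result is the header cell "author"
        System.out.println(firstColumn("author,title\nDan Simmons,Hyperion\n"));
    }
}
```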
5.2. Accessing Columns by Predefined Headers
When the column names are known in advance, we can declare them on the format and read values by name, which is more intuitive than accessing them by index:
Iterable<CSVRecord> records = CSVFormat.DEFAULT
  .withHeader("author", "title").parse(in);
for (CSVRecord record : records) {
    String author = record.get("author");
    String title = record.get("title");
}
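When the headers are predefined like this and the file also contains its own header line, that line is returned as a data record unless the format is told to skip it. Here is a minimal sketch of the difference, using withSkipHeaderRecord(boolean); the class PredefinedHeaderExample, the helper authors() and the sample data are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

class PredefinedHeaderExample {

    // Collects the "author" column, optionally skipping the input's own header line
    static List<String> authors(String csv, boolean skipHeaderLine) {
        CSVFormat format = CSVFormat.DEFAULT
          .withHeader("author", "title")
          .withSkipHeaderRecord(skipHeaderLine);
        List<String> authors = new ArrayList<>();
        try (CSVParser parser = format.parse(new StringReader(csv))) {
            for (CSVRecord record : parser) {
                authors.add(record.get("author"));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return authors;
    }

    public static void main(String[] args) {
        String csv = "author,title\nDan Simmons,Hyperion\n";
        System.out.println(authors(csv, false)); // header line is treated as data
        System.out.println(authors(csv, true));  // header line is skipped
    }
}
```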
5.3. Using Enums as Headers
Using Strings for accessing column values can be error-prone. Using Enums instead of Strings will make the code more standardized and easier to understand:
enum BookHeaders {
    author, title
}
Iterable<CSVRecord> records = CSVFormat.DEFAULT
  .withHeader(BookHeaders.class).parse(in);
for (CSVRecord record : records) {
    String author = record.get(BookHeaders.author);
    String title = record.get(BookHeaders.title);
}
5.4. Skipping the Header Line
Usually, CSV files contain headers in the first line. Hence, in most cases, it is safe to skip it and start reading from the second row.
The following format auto-detects the headers from the first line, so we can access column values by name:
Iterable<CSVRecord> records = CSVFormat.DEFAULT
  .withFirstRecordAsHeader().parse(in);
for (CSVRecord record : records) {
    String author = record.get("author");
    String title = record.get("title");
}
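To see which headers were detected, we can ask the parser for its header map, which maps each column name to its index. The sketch below is a minimal illustration using in-memory data; HeaderDetectionExample and detectHeaders() are hypothetical names, not part of the library.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;

class HeaderDetectionExample {

    // Returns the column names found in the first line, mapped to their indexes
    static Map<String, Integer> detectHeaders(String csv) {
        try (CSVParser parser = CSVFormat.DEFAULT
          .withFirstRecordAsHeader()
          .parse(new StringReader(csv))) {
            return parser.getHeaderMap();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(detectHeaders("author,title\nDan Simmons,Hyperion\n"));
    }
}
```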
6. Conclusion
We presented the use of Apache’s Commons CSV library through a simple example. You can read more about the library in its official documentation.
The code for this article is available over on GitHub.