Last Updated on 30 May 2019   |   Print Email
In this tutorial, I will share with you how to read Excel files programmatically using Java.You know, Excel is the very popular file format created by Microsoft. Although it is not an opened file format, Java applications can still read and write Excel files using the Apache POI - the Java API for Microsoft Documents, because the development team uses reverse-engineering to understand the Excel file format. Hence the name POI stands for Poor Obfuscation Implementation.This tutorial shows you how simple and easy it is to read Excel files using Apache POI’s API.
1. Getting Apache POI library
Apache POI is the pure Java API for reading and writing Excel files in both formats XLS (Excel 2003 and earlier) and XLSX (Excel 2007 and later). To use Apache POI in your Java project:
For non-Maven projects:
Download the latest release of the library here: Apache POI - Download Release ArtifactsExtract the zip file and add the appropriate JAR files to your project’s classpath:- If you are reading and writing only Excel 2003 format, only the file poi-VERSION.jar is enough.- If you are reading and writing Excel 2007 format, you have to include the following files:
poi-ooxml-VERSION.jar
poi-ooxml-schemas-VERSION.jar
xmlbeans-VERSION.jar
For Maven projects: Add the following dependency to your project’s pom.xml file:
The latest stable version of Apache POI is 3.11 (at the time of writing this tutorial).
2. The Apache POI API Basics
There are two main prefixes which you will encounter when working with Apache POI:
HSSF: denotes the API is for working with Excel 2003 and earlier.
XSSF: denotes the API is for working with Excel 2007 and later.
And to get started the Apache POI API, you just need to understand and use the following 4 interfaces:
Workbook: high level representation of an Excel workbook. Concrete implementations are: HSSFWorkbookand XSSFWorkbook.
Sheet: high level representation of an Excel worksheet. Typical implementing classes are HSSFSheetand XSSFSheet.
Row: high level representation of a row in a spreadsheet. HSSFRowand XSSFRoware two concrete classes.
Cell: high level representation of a cell in a row. HSSFCelland XSSFCellare the typical implementing classes.
Now, let’s walk through some real-life examples.
3. Reading from Excel File Examples
Suppose you want to read an Excel file whose content looks like the following screenshot:This spreadsheet contains information about books (title, author and price).
A Simple Example to Read Excel File in Java
Here’s a dirty example that reads every cell in the first sheet of the workbook and prints out values in every cell, row by row:
package net.codejava.excel;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Iterator;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
/**
* A dirty simple program that reads an Excel file.
* @author www.codejava.net
*
*/
public class SimpleExcelReaderExample {
public static void main(String[] args) throws IOException {
String excelFilePath = "Books.xlsx";
FileInputStream inputStream = new FileInputStream(new File(excelFilePath));
Workbook workbook = new XSSFWorkbook(inputStream);
Sheet firstSheet = workbook.getSheetAt(0);
Iterator<Row> iterator = firstSheet.iterator();
while (iterator.hasNext()) {
Row nextRow = iterator.next();
Iterator<Cell> cellIterator = nextRow.cellIterator();
while (cellIterator.hasNext()) {
Cell cell = cellIterator.next();
switch (cell.getCellType()) {
case Cell.CELL_TYPE_STRING:
System.out.print(cell.getStringCellValue());
break;
case Cell.CELL_TYPE_BOOLEAN:
System.out.print(cell.getBooleanCellValue());
break;
case Cell.CELL_TYPE_NUMERIC:
System.out.print(cell.getNumericCellValue());
break;
}
System.out.print(" - ");
}
System.out.println();
}
workbook.close();
inputStream.close();
}
}
Output:
Head First Java - Kathy Serria - 79.0 -
Effective Java - Joshua Bloch - 36.0 -
Clean Code - Robert Martin - 42.0 -
Thinking in Java - Bruce Eckel - 35.0 -
A More Object-Oriented Example to read Excel File
For nicer and more object-oriented program, let’s create a model class (Book.java) with the following code:
package net.codejava.excel;
public class Book {
private String title;
private String author;
private float price;
public Book() {
}
public String toString() {
return String.format("%s - %s - %f", title, author, price);
}
// getters and setters
}
Write a method that reads value of a cell as following:
private Object getCellValue(Cell cell) {
switch (cell.getCellType()) {
case Cell.CELL_TYPE_STRING:
return cell.getStringCellValue();
case Cell.CELL_TYPE_BOOLEAN:
return cell.getBooleanCellValue();
case Cell.CELL_TYPE_NUMERIC:
return cell.getNumericCellValue();
}
return null;
}
Next, implement a method that reads an Excel file and returns a list of books:
public List<Book> readBooksFromExcelFile(String excelFilePath) throws IOException {
List<Book> listBooks = new ArrayList<>();
FileInputStream inputStream = new FileInputStream(new File(excelFilePath));
Workbook workbook = new XSSFWorkbook(inputStream);
Sheet firstSheet = workbook.getSheetAt(0);
Iterator<Row> iterator = firstSheet.iterator();
while (iterator.hasNext()) {
Row nextRow = iterator.next();
Iterator<Cell> cellIterator = nextRow.cellIterator();
Book aBook = new Book();
while (cellIterator.hasNext()) {
Cell nextCell = cellIterator.next();
int columnIndex = nextCell.getColumnIndex();
switch (columnIndex) {
case 1:
aBook.setTitle((String) getCellValue(nextCell));
break;
case 2:
aBook.setAuthor((String) getCellValue(nextCell));
break;
case 3:
aBook.setPrice((double) getCellValue(nextCell));
break;
}
}
listBooks.add(aBook);
}
workbook.close();
inputStream.close();
return listBooks;
}
[Head First Java - Kathy Serria - 79.000000, Effective Java - Joshua Bloch - 36.000000,
Clean Code - Robert Martin - 42.000000, Thinking in Java - Bruce Eckel - 35.000000]
How to Read both Excel 2003 and 2007 format in Java
For better supporting both users using Excel 2003 and 2007, it’s recommended to write a separate factory method that returns an XSSFWorkbookor HSSFWorkbookdepending on the file extension of the file (.xls or .xlsx):
private Workbook getWorkbook(FileInputStream inputStream, String excelFilePath)
throws IOException {
Workbook workbook = null;
if (excelFilePath.endsWith("xlsx")) {
workbook = new XSSFWorkbook(inputStream);
} else if (excelFilePath.endsWith("xls")) {
workbook = new HSSFWorkbook(inputStream);
} else {
throw new IllegalArgumentException("The specified file is not Excel file");
}
return workbook;
}
And here’s a usage example of this factory method:
String excelFilePath = "Books.xlsx"; // can be .xls or .xlsx
FileInputStream inputStream = new FileInputStream(new File(excelFilePath));
Workbook workbook = getWorkbook(inputStream, excelFilePath);
Reading Other Information
Get a specific sheet:
Sheet thirdSheet = workbook.getSheetAt(2);
Get sheet name:
String sheetName = sheet.getSheetName();
Get total number of sheets in the workbook:
int numberOfSheets = workbook.getNumberOfSheets();
Get all sheet names in the workbook:
int numberOfSheets = workbook.getNumberOfSheets();
for (int i = 0; i < numberOfSheets; i++) {
Sheet aSheet = workbook.getSheetAt(i);
System.out.println(aSheet.getSheetName());
}
Nam Ha Minh is certified Java programmer (SCJP and SCWCD). He began programming with Java back in the days of Java 1.4 and has been passionate about it ever since. You can connect with him on Facebook and watch his Java videos on YouTube.
Hi Lakshmipriya, It's better to know the format of the Excel files beforehand. Otherwise read all cell values as Object and determine the type using instanceof operator.
I feel there is little dependency, during setting a value in setter method of Book class we are doing type casting from object to corresponding field type of book attributes, like there we are chking 1 means String,2. String, 3. double like so. How about a larger file ? Here we have 2 fields so we know the type.. how about more coulmns if we have?
Comments
It's better to know the format of the Excel files beforehand. Otherwise read all cell values as Object and determine the type using instanceof operator.
fields so we know the type.. how about more coulmns if we have?