A while back I wrote a post on runtime modification of remote site settings and mentioned some great work that had been done around Zip File support and the ability to do deployments from Salesforce.
Link to that article here: runtime modification of remote site settings
Aside from this use case, and the more obvious uses of Zip files on Salesforce itself (rather than hacking VisualForce pages to process Zip Files), there is something far more fun you can do with the Zip code ….. Excel parsing!
If you trawl through the documentation for the .xlsx extension you find that an Excel document (and Word, PowerPoint, etc) are actually Zip packages that contain a specific set of files that are parsed and rendered by the appropriate application.
If you create a blank Excel workbook, Save it and then change the file extension to .zip, you get the following file structure:
So, all we have to do is repeat the same process using Zippex. To do this we instantiate an instance of the Zippex class and then load some stuff.
First you’ll notice I load the contents of the file ‘SharedStrings.xml’, this is exactly what it sounds like, a collection of strings that are referenced by the Excel document https://docs.microsoft.com/en-us/office/open-xml/working-with-the-shared-string-table
These are loaded into a map based on the count, I’ll be using these later to reconstruct the actual contents of the Spreadsheet.
Secondly, I grab each of the files under the worksheets directory and save these into a map based on the sheet name.
Right now the fun part …. I thought of a couple of options on how I wanted to access the values of the worksheet and after trying it a couple of times in “real world” scenarios I decided I just wanted a string value returned for each Sheet, Column and Row rather than end up with Arrays of strings, so the implementation is actually pretty simple.
How this works then is pretty self evident, call valueOf and hey presto!
There’s a couple of pieces of complexity missing for parsing the actual spreadsheet and finding a row count.
And the last piece is actually for parsing the Xml into something we can use.
Other than that is pretty simple!
Feel free to email me directly for the code, I’m happy to help remember of course that all this is based on the great work of Pedro Dal Col & Pliny Smith and the Zippex code.