The tools listed below offer at least some free functionality that you can use to move data from one format to another as needed to answer your research questions. This is not an exclusive list, nor does a tool's presence on it mean that your team is required to use it. It is a reference list only.
Tabula is a tool for mining data locked in PDF files. It lets you extract that data into a CSV file or Microsoft Excel spreadsheet through a simple interface. Tabula works on Mac, Windows and Linux.
With Trifacta Wrangler, any user with a Mac or Windows machine and an Internet connection can download, install and start using Trifacta immediately. Trifacta Wrangler lets analysts wrangle diverse data sources on their desktop in preparation for use in analytics or visualization tools such as Tableau. It does not require an underlying storage and processing environment beyond what is already available on modern Mac and Windows machines. Because Trifacta Wrangler is a connected desktop application (the same hybrid architecture used by Spotify, Slack and others), users can work with data locally on their machine while still benefiting from product updates and metadata access over an Internet connection.
Python is an excellent language for data manipulation. Add the Pandas library, which includes the DataFrame object, and data scientists can quickly perform even more complex operations, such as merging, joining, and transforming large amounts of data with a single Python statement.
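As a rough illustration of the kind of merging and transforming described above, here is a minimal sketch using Pandas. The table contents and the column names (customer_id, amount, region) are invented for this example and do not come from any particular dataset; your own data will need its own keys and columns.

```python
import pandas as pd

# Hypothetical sample data, invented purely for illustration.
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "amount": [20.0, 35.5, 10.0, 7.25],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "region": ["East", "West", "East"],
})

# Join the two tables on their shared key (a SQL-style left join),
# so every order row gains the region of its customer.
merged = orders.merge(customers, on="customer_id", how="left")

# Transform: total order amount per region, in a single statement.
totals = merged.groupby("region")["amount"].sum()

print(totals)
```

The left join keeps every row from the orders table even if a customer is missing from the second table, which is often the safer default when you are still exploring the data.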