Last semester, several of my students ran afoul of a perennial problem with PowerPoint. They had created their slide decks on a large screen, but when they connected to the room's projector, it forced a lower screen resolution.

Blammo! -- ugly slides. Text too big, images cut off, broken layouts everywhere. The wreckage was so horrifying that two design students in the audience were forced to avert their eyes.

There are several ways to solve this problem, but here's the trick I use all the time: Instead of showing your slides in PowerPoint, make your presentation deck into a PDF, and show that. PDFs will scale, unlike PowerPoint, and will capture your presentation faithfully, no matter how different the resolution might be.[^faithfully]

[^faithfully]: Well, animations and sound won't survive the conversion. But I see that as a feature, not a bug.

Here's how you do it:


1. Install software that allows you to create a PDF

Several free software packages allow you to create a PDF.

  • LibreOffice is a free, open source office suite (essentially a replacement for Microsoft Office), which has a PDF export feature. There are versions for Windows, Macs, and Linux. (An earlier version was called OpenOffice.) You can use LibreOffice Impress (a PowerPoint work-alike) to create your presentation, or just open your PowerPoint file in Impress to export it to PDF. This is what I usually use.
  • PDFCreator is a free and open source program to help you create PDFs in Windows. When you install it, PDFCreator makes a new printer device. Then you print your document -- in this case, your presentation slides -- to this special printer, and it saves the output as a PDF. This works from any program that can print, not just PowerPoint.
  • On a Mac, it's even easier -- the capability to print to a PDF is built right into the system. This tutorial from MIT shows how to do it (with pictures).
  • Microsoft has even created a downloadable add-in to allow you to generate PDFs from within Microsoft Office.
  • If you happen to have paid for Adobe Acrobat (not just Acrobat Reader), you can generate PDFs with Acrobat.

2. Create your presentation in PowerPoint in the normal way

3. Generate a PDF of your final presentation

Create your PDF by printing to the PDF printer from within PowerPoint. If you're using LibreOffice, just click the PDF button in the toolbar to create it.
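
If you would rather not click through a GUI at all, LibreOffice can also do the conversion from the command line. This is just a minimal sketch, assuming LibreOffice is installed and your deck is named slides.pptx (a placeholder name); the resulting PDF lands in the current directory:

libreoffice --headless --convert-to pdf slides.pptx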

4. Deliver your presentation from your PDF instead of your PowerPoint

To do this, just open the PDF in the normal way with Acrobat Reader or other PDF-viewing software.

The trick is to make it display full-screen. In Acrobat Reader, hit Ctrl-L to make it full-screen. In full-screen mode, use the arrow-keys or click the mouse to make the slides advance. (Left mouse button, down-arrow, or right-arrow all advance one slide; right mouse button, up-arrow, or left-arrow all go back one slide.) When you want to exit full-screen mode, hit the Esc key. (Mac users can use Adobe Reader, or see Apple's instructions for using Preview in full-screen mode.)
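
If you happen to be presenting from a Linux laptop instead, most PDF viewers offer a similar presentation mode. As a sketch, assuming Evince is installed and your file is named slides.pdf (again, a placeholder), this opens the deck full-screen right away:

evince --presentation slides.pdf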

When you deliver your presentation from a PDF instead of a PowerPoint file, each slide retains the layout you originally created for it. If the projector forces a different resolution, the PDF scales the whole slide down gracefully, instead of scaling some elements and leaving others full-size the way PowerPoint does.

Then you will be free to concentrate on delivering an excellent presentation -- not worrying about the visual mess your carefully crafted slides could become.

Posted late Wednesday morning, January 8th, 2014

This post explains the napptools scripts and how they transform NAPP download files into a SQLite database.


The tools

napptools consists of three script programs:

  • napp2csv.sh: A Bash script that uses the traditional unix tools cut, sed, and tr to chop a NAPP data file into its respective columns, guided by a SAS-format command file (a sketch of this step appears after this list). It also creates secondary tables in .CSV format from the variable descriptions in the SAS file.

  • csv2sqlite: A public-domain AWK program written by Lorance Stinson (available from http://lorance.freeshell.org/csvutils/) to convert .CSV files into a series of SQL statements that load the data into a database. Eric Nystrom made two small changes to Stinson's original code: one to fix a bug, and one to better fit the output to SQLite by specifying non-typed columns.

  • nappbuild.sh: A Bash script to employ napp2csv.sh and csv2sqlite to create .CSV and .SQL files, then load them into a SQLite database.
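
To give a flavor of what napp2csv.sh does, here is a minimal sketch of the fixed-width chopping step. The column positions are hypothetical (say SERIAL in columns 1-8 and AGE in columns 9-11); the real script takes them from the SAS command file rather than hard-coding them:

# Hypothetical column positions; napp2csv.sh reads them from the SAS command file.
paste -d',' \
  <(zcat napp_00001.dat.gz | cut -c1-8) \
  <(zcat napp_00001.dat.gz | cut -c9-11) > data.csv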

Usage

  1. Get an account at http://nappdata.org, receive access approval, and select your desired variables. Download the fixed-width text data file itself, which will end in a .dat.gz extension, as well as a command file in SAS format.
  2. Ensure all dependencies are met. On most Linux systems, the only one you may need to install is sqlite3, the command-line client for the SQLite database package.
  3. Run nappbuild.sh in the directory containing your data file and your command file, passing the name of the SQLite database you wish to create.
    • If your data file is napp_00001.dat.gz and your command file is napp_00001.sas, then run nappbuild.sh like so:
    • nappbuild.sh -i -n napp_00001 -d MyNAPPData.db
  4. From there, you can use your database from the SQLite command shell sqlite3 or your favorite programming language.
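
For example, assuming you named the database MyNAPPData.db as above, a quick sanity check from the command line might look like this (the data table is described in the next section):

sqlite3 MyNAPPData.db 'SELECT count(*) FROM data;'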

Database structure

For databases created with napptools, most of the NAPP data ends up in a single large table, called data. Each of the columns in data is named for the NAPP field, such as SERIAL, PERWT, NAMELAST, etc. (Since SQLite's column names are not case-sensitive, lower case works fine too.)

Some of these columns have self-contained information, such as NAMELAST or OCCSTRNG, but others contain a numeric code that will typically need to be translated into human-readable values. Translations for these codes were offered by NAPP in the command file. The napptools suite breaks those translations out of the command file, into separate tables loaded into the SQLite database. These secondary tables are named for the NAPP variable, and always contain two columns, id and desc. With this information, it is easy to use a SQL JOIN command to bring the translations into your results, or you can refer to the codes directly if desired.
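
In other words, a secondary table such as race boils down to something like this sketch (column types deliberately left off, since csv2sqlite declares non-typed columns):

CREATE TABLE race (id, desc);  -- "desc" is accepted unquoted by SQLite; quote it if another engine complains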

For example, to show the number of people listed in each category of the race variable in the state of New Mexico (stateus value of 35):

SELECT race.desc, count(*)
FROM data
JOIN race ON data.race = race.id
WHERE stateus = 35
GROUP BY race.desc;
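
And because the numeric codes themselves live in the data table, you can also skip the JOIN and filter on a code directly. The race code below is just a placeholder; look up the real values in the race table first:

SELECT count(*)
FROM data
WHERE stateus = 35 AND race = 300;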

Some caveats

  • The SQLite index generation routine is rather crude: it makes an index for every column in the data table and on the id column in all secondary tables. This is likely overkill, but indexes on at least some of the columns undoubtedly help many queries (the generated statements are of the form sketched below).
  • This was designed and used on a Debian Linux system. It seems likely that it will be portable to similar unix-based systems as long as the dependencies are all met, but YMMV.
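
For reference, the statements the index routine generates are of roughly this form (index names may differ; run .schema in sqlite3 to see exactly what was created):

CREATE INDEX data_race_idx ON data (race);
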
Posted early Friday morning, January 17th, 2014

We historians often try not to fall in love with our sources, but sometimes I just can't help it. For me, aside from the chatty personal journal (what historian can resist?) and the underground mine maps I've studied for years, my greatest fondness may be for big compilations of small bits of data, called microdata.

What's microdata? It's small bits of information that, by themselves, might be virtually useless, but when aggregated and analyzed can show bigger trends.

The classic use of microdata is the census. Whether you filled out the form yourself or talked to the canvasser ("enumerator") who visited your house, small bits of information about you and your family were recorded. By itself, this isn't much -- half a dozen or more websites probably know more about you than the US Census Bureau does. But when placed together with similar information from other people, we can see broader trends -- this neighborhood is slipping into poverty, or that one has an emerging immigrant business community, or this county will need to plan for more school capacity because of the number of young children.

What if we could use that same kind of explanatory power to help understand the past? This question, and the use of microdata that it implies, has motivated historians since the advent of computers in the 1960s. The raw records of the census, which in the United States has been conducted every ten years since 1790, are a good source for this microdata. These raw records (termed "manuscript census" records) are released to researchers 72 years after they were created. One challenge is that they are handwritten (hence "manuscript"), meaning that any researcher has to carefully transcribe them before they can be used as microdata. But that transcription only needs to be done once, if researchers do a good job and are willing to share.

Thank goodness those people exist! Some of the best are at the Minnesota Population Center (MPC) at the University of Minnesota. They compile and make available microdata to researchers through several different projects. Their most famous one, in the research world, is IPUMS, the Integrated Public Use Microdata Series, but they also have important lesser-known microdata compilations.

The one I've used most is the North Atlantic Population Project (NAPP). The MPC collaborated with several other institutions on both sides of the Atlantic to make historical census microdata available from multiple countries. NAPP converts the data so that variables can be compared directly between countries. They also add variables, derived from information in each census, that extend the sorts of questions researchers can ask. Best of all, NAPP shares their microdata collections with researchers for free, provided you promise to cite it appropriately,[^nappcite] not redistribute the data, and not use it for genealogical purposes. In exchange for complying with these very moderate restrictions, NAPP makes available samples of census microdata covering the US, Canada, Great Britain, Sweden, Norway, and parts of present-day Germany during the 19th and early 20th centuries. Some of these samples are complete transcriptions of all of the manuscript census records, plus all the extra NAPP bells and whistles.

Most researchers use statistical software packages such as SPSS or STATA to browse and manipulate NAPP data. I took a different tack, creating a set of scripts that manipulate a NAPP data file and load it into a database, where I can explore it with SQL, the standard language for querying databases. You can learn more about the gory technical details in my post on the napptools scripts. Over time, I will post some sample SQL statements here on this blog.

Microdata census records combined with NAPP's additional variables and a powerful search tool make it easy to dive deeply into historical patterns large and small. As long as the limitations of the original census sources are kept in mind, an extraordinary range of questions -- about race, work, geography, gender, age, the family, and more -- can be asked that would be difficult to answer any other way. Can you blame me for having a soft spot in my heart for historical microdata?

[^nappcite]: Minnesota Population Center. North Atlantic Population Project: Complete Count Microdata. Version 2.0 [Machine-readable database]. Minneapolis: Minnesota Population Center, 2008. Additionally, each data set offered by NAPP has its own citation.

Posted early Friday morning, January 17th, 2014