Where does this site get data from? In San Francisco, the SFMTA is the government authority tasked with giving out parking tickets. On average, they issue 1 parking ticket every 24 seconds. They employ about 300 officers who drive tiny single-seat vehicles around, looking for violations.
I thought it would be interesting to visualize parking ticket data. I discovered that the city website people use to pay their tickets also includes a full copy of the citation. But you need to know the citation ID number, which presumably you only know if you have the ticket in your hand. I don’t have a car, but my roommate does and he got a ticket recently.
This is gold! I can see everything: the make, color, location, the reason for the ticket, license plate, and even the initials of the officer who wrote the ticket. I would really love to be able to see EVERY ticket. But I don’t know the IDs of every ticket. BUT WAIT. These numbers look like they go sort of in order. Current IDs are somewhere around 992,000,000. And tickets issued a couple months ago were around 988,000,000.
I was looking at ticket 984,946,605. When I type in 1 higher, 984,946,606, no ticket is found. That makes sense: there definitely aren’t close to a billion tickets issued. So how are these IDs generated? There must be some logic to it.
After digging for a while, I’m pretty sure I know how. It seems each possible ticket number follows a pattern: add 11, except add 4 if the last digit is 6. So no ticket can end in 7, 8, or 9. So the ticket after 984,946,605 is actually 984,946,616, and after that is 984,946,620. Only God knows why, but I assume this is a remnant of an old system.
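As a minimal sketch, the pattern can be written as a small Python function. The names `next_ticket_id` and `ticket_ids` are made up for illustration, and the rule is only the one inferred from looking at IDs, not a documented scheme.

```python
def next_ticket_id(ticket_id: int) -> int:
    """Return the next possible ID under the observed rule:
    add 11, except add 4 when the last digit is 6."""
    if ticket_id % 10 == 6:
        return ticket_id + 4
    return ticket_id + 11


def ticket_ids(start: int, count: int) -> list[int]:
    """List `count` consecutive possible IDs, beginning with `start`."""
    ids = [start]
    while len(ids) < count:
        ids.append(next_ticket_id(ids[-1]))
    return ids


# Starting from the ticket I was looking at:
print(ticket_ids(984_946_605, 4))
# [984946605, 984946616, 984946620, 984946631]
```

Under this rule the last digit cycles through 0 to 6 and never reaches 7, 8, or 9, which matches what I saw.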
I also discovered the devices that parking cops use probably claim IDs in batches of 100. So if an officer just wrote a ticket, I know with certainty the ID after that one will be the next ticket the same officer writes. AND, immediately after a ticket is written, it becomes available to be viewed on the city’s website.
With all of this knowledge, I wrote a scraper that works extremely efficiently and stores parking tickets almost immediately after they are written. Because there are about 300 parking cops, there are about 300 incomplete batches of 100 tickets. All I have to do is check the first ticket in each batch I don’t have in my database. This approach means I only have to make a request to the city’s website every few seconds. Great!
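Roughly, one polling pass could look like the sketch below. This is not the actual scraper: `poll_frontiers`, the `frontiers` dictionary, and the `fetch` callback are hypothetical names, and `fetch` stands in for whatever request looks up a citation on the city's website (returning `None` when no ticket is found). It also skips what happens when an officer's batch of 100 runs out and a new batch is claimed.

```python
from typing import Callable, Optional


def next_ticket_id(ticket_id: int) -> int:
    # Observed rule: add 11, except add 4 when the last digit is 6.
    return ticket_id + 4 if ticket_id % 10 == 6 else ticket_id + 11


def poll_frontiers(
    frontiers: dict[str, int],
    fetch: Callable[[int], Optional[dict]],
) -> list[dict]:
    """One polling pass over every open batch.

    `frontiers` maps a batch label to the first ID in that batch that
    is not in the database yet. Each batch costs one request unless a
    new ticket turns up, in which case we keep walking forward.
    """
    new_tickets = []
    for batch, candidate in list(frontiers.items()):
        ticket = fetch(candidate)
        while ticket is not None:
            new_tickets.append(ticket)
            candidate = next_ticket_id(candidate)
            ticket = fetch(candidate)
        # Remember where to resume on the next pass.
        frontiers[batch] = candidate
    return new_tickets


# Usage with a fake lookup in place of the real website:
issued = {984_946_605, 984_946_616}
fake_fetch = lambda i: {"id": i} if i in issued else None
frontiers = {"batch-a": 984_946_605, "batch-b": 990_000_000}
found = poll_frontiers(frontiers, fake_fetch)
```

With about 300 open batches, a quiet pass is about 300 lookups, and spreading those out over time keeps the request rate low.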
☹️