About Me!

This blog is about my musings and thoughts. I hope you find it useful, at most, and entertaining, at least.

Résumé [PDF]

Other Pages

Quotes

Links

Presence Elsewhere

jim@jimkeener.com

GitHub

BitBucket

Why I'm a Luddite, kind-of (Part 2)

Date: 20220527

Tags: technology Luddite freedom liberty

In 2013 I wrote a ~~ramble~~ post called Why I’m a Luddite, kind-of. Now, some 9 years later, I want to revisit
some of my opinions then and add some more color to my basic premise. If
anything, I think my opinion has gone from “sometimes technology doesn’t
actually make things better” to “usually technology doesn’t actually
make things better.”


I’ve come to loathe “smart” interfaces.


In 2013 I was still two years away from having a child. While Anne and I
weren’t able to get our families to strictly adhere to the “no batteries
allowed” rule, they did follow it pretty well and I’m pretty happy with
the result. Our children watch a little more TV than I’d prefer. A
significant portion of it is made up of history and science
documentaries and they are (usually) more than willing or even beg to go
outside to play, so it’s a mixed bag.


One sentence, of many, I do think has aged very poorly and doesn’t really reflect
my current beliefs, though, is “I just wish so many people wouldn’t use
them as a substitute for real life interaction.” Before 2020, but
specially in a post-2020 World, that sentence was just me talking out of
my butt. While I still think real-life interactions are important,
interacting with others any way in which we’re comfortable is a
positive.

Visualizing System-wide Headway

Date: 20191113

Tags: postgis postgres gis gtfs transit

I was bored this evening and decided to visualized the system-wide headway,
that is the amount of time between buses at a stop for every stop in the
system.


Given the stop-distance table from the previous
post, I set out to compute the headway for each stop segment (the portion
of a route between stops).


— Compute the max headway for each stop per hour.
create table overall_headway as
select stop_id, hour, max(headway) as max_headway
from ( select stop_id, extract(hour from departure_time) as hour, departure_time, lag(departure_time, 1) over w as prev_departure_time, extract(epoch from (departure_time – lag(departure_time, 1) over w))/60 as headway from gtfs.stops_3365 join gtfs.stop_times using (stop_id) window w as (partition by stop_id order by departure_time asc)
)x
group by stop_id, hour;

— Assign the segments created for the stop-distance calculation
— a headway from overall_headway.
create table stop_line_headway as
select coalesce(ohs.hour, ohp.hour) as hour, sdl.stop_id, sdl.dist, prev_stop_id, min(least(ohs.max_headway, ohp.max_headway)) as headway, the_geom
from stop_dist_line sdl
left join overall_headway ohs on sdl.stop_id = ohs.stop_id
left join overall_headway ohp on sdl.prev_stop_id = ohp.stop_id
where (ohs.hour is null or ohp.hour is null or (ohs.hour = ohp.hour)) and st_length(the_geom) < 2500
group by coalesce(ohs.hour, ohp.hour), sdl.stop_id, sdl.dist, prev_stop_id, the_geom;

alter table stop_line_headway add id serial primary key;


I then pulled this into QGIS, colored by graduation, and filtered per hour.
For each our, I exported a PNG of the map and tried by hand at KDEnlive for the
first time. (Although ffmpeg could have gotten the job done as well.)


Raw video, images, and more details on building the map to come.

Stop Spacing, Part 1

Date: 20191112

Tags: postgis postgres gis gtfs transit

For reasons I’ll get into in another post, I wanted to have a rough idea of the
distribution of the distance between stops. To do this I used the
GTFS import for the Port Authority of Allegheny County
GTFS
that
I did earlier this year when calculating the service
area
.


A quick note about projections. Since I had created a table with a geometry in
the Pennsylvania South (ftUS) state plane( EPGS 3365),
any distance calculations I do will be done in feet.


A quick, rough and dirty experiment went like this:


create table stop_dist_line as 
select stop.stop_id, stop.stop_name, prev.stop_id as prev_stop_id, prev.stop_name as prev_stop_name, — Just using euclidean distance right now. — This could be improved by using the on-the-street distance. stop.the_geom <-> prev.the_geom as dist, st_setsrid(st_makeline(stop.the_geom, prev.the_geom), 3365) as the_geom
from ( — This gets unique pairs of stops, regardless of ordering. select distinct least(stop_id, prev_stop_id) as prev_stop_id, greatest(stop_id, prev_stop_id) as stop_id — This gets all pairs of stops. In theory there could be duplicate pairs — based on direction. from (select stop_id, lag(stop_id, 1) over stop_order prev_stop_id from gtfs.stop_times join gtfs.trips using (trip_id) window stop_order as (partition by trip_id, direction_id order by stop_sequence) ) unordered_pairs
) pairs
join gtfs.stops_3365 stop on stop.stop_id = pairs.stop_id
join gtfs.stops_3365 prev on prev.stop_id = pairs.prev_stop_id;

— Remove segments that don’t have a line because there is no
— previous stop to have a line or have a distance from.
delete from stop_dist_line where prev_stop_id is null;

— Register the new geometry column
select Populate_Geometry_Columns();
— Add a PK to play more nicely with QGIS
alter table stop_dist_line add id serial primary key;
— Add an index on the geometry.
create index on stop_dist_line using gist(the_geom);


We can then get some idea of the distribution by bucketing by 100s.


select dist_group,
       ttlsf,
       ttl,
       round(ttlsf/ttl, 2) as perc,
       c
from
  (select dist_group,
          sum(c) over (rows between unbounded preceding and current row) as ttlsf,
          sum(c) over () as ttl,
          c
   from (
      select 100*floor(dist / 100) as dist_group,
             count(*) as c
      from stop_dist_line
      group by dist_group
      order by dist_group) x)y;

Giving


 dist_group | ttlsf | ttl  | perc |  c
------------+-------+------+------+------
          0 |   149 | 7732 | 0.02 |  149
        100 |   223 | 7732 | 0.03 |   74
        200 |   681 | 7732 | 0.09 |  458
        300 |  1568 | 7732 | 0.20 |  887
        400 |  2528 | 7732 | 0.33 |  960
        500 |  3579 | 7732 | 0.46 | 1051
        600 |  4462 | 7732 | 0.58 |  883
        700 |  5135 | 7732 | 0.66 |  673
        800 |  5589 | 7732 | 0.72 |  454
        900 |  5946 | 7732 | 0.77 |  357
       1000 |  6234 | 7732 | 0.81 |  288
       1100 |  6456 | 7732 | 0.83 |  222
       1200 |  6628 | 7732 | 0.86 |  172
       1300 |  6766 | 7732 | 0.88 |  138
       1400 |  6852 | 7732 | 0.89 |   86
       1500 |  6921 | 7732 | 0.90 |   69
       ...

However, because PostgreSQL is awesome, we can also just compute percentiles
directly.


select percentile_cont(0.25) within group (order by dist) as p_25,
       percentile_cont(0.35) within group (order by dist) as p_35,
       percentile_cont(0.45) within group (order by dist) as p_45,
       percentile_cont(0.50) within group (order by dist) as p_50
from stop_dist_line;

Giving


        p_25       |       p_35       |       p_45       |       p_50
------------------+------------------+------------------+------------------
 438.775035648918 | 518.085656645151 | 590.352075165022 | 631.007847560984

Next steps would be to correct many of the little corners cut, e.g. not using
on-the-street distance, and rerunning the calculations.

Calculating Service Area, Part 4: Next Steps

Date: 20190127

Tags: postgis postgres transit census

In Part 1 we
talked about the project and got some of the tools and data we need to
start. A quick reminder of our motivation: to figure out the percentage
area and population the county transit agency serves by using a
“walkshed”.


Parts



Walkshed


There are two changes I should make to the walkshed calculation before I dig
into per-route statistics: (1) excluding interstates and (2) light rail and
busways stations should have a half mile walkshed. I don’t believe they made a
difference at the county level; once I redo the calculations I’ll come back to
see if I have to eat those words.


More In-Depth Demographic Analysis


Since we have all of the census data, all of the route data, and we broke our
walkshed down by stop it’d be quite easy to look at the service area as a
whole, and routes in particular against other socio-economic indicators.


Also, more interestingly, who isn’t being served?


Filter Walkshed by Reliable Transit


I’d love to look at what the walkshed is when only stops that are served more
frequently than every 30 min mid-day is included. Another alternative would be
late at night, or weekends. Essentially, what’s the service area that you could
do a good deal in without having to get into a car?


Build a web-based tool to export this data


I would love to make this data more usable and accessible. I previously made a
little tool using python, mapnik, and a
file from GDAL2Tiles. I’d like to explore if QGIS’ “save project to database“
functionality could be used (i.e. mangled, forced, hacked?) to do this more
cleanly.


Isochron / Isotrip maps


I’ve been wanting to build an application where you could pick a location
and it’d generate two (interactive) maps of all other stops: an isochron map
showing how long it’d take to get anywhere else (and that has 15-min isochron
lines showing the boundry of how long it takes to get how far) and an isotrip
map showing how many trips/transfers it’d take to get to the other stop, with
similar isotrip lines.


Create Subject Tables


The Access DBs also contain information on what all the fields mean and which
subject tables they belong to. We can extract it with the mdb-export tool.

mdb-export SF1_Access2003.mdb data_field_descriptors &gt; data_field_descriptors.csv

Calculating Service Area, Part 3: Demographics

Date: 20190127

Tags: postgis postgres transit census

In Part 1 we
talked about the project and got some of the tools and data we need to
start. A quick reminder of our motivation: to figure out the percentage
area and population the county transit agency serves by using a
“walkshed”.


In Part 2, we
defined what our walkshed is and defined how we would be linking it to the
Census 2010. In this post, we’ll be doing the linking


I want to point out sproke’s blogpost from
2012

where I got many of these ideas many years back.


Parts



Importing the Census 2010 Data


First we need to gather the data from census.gov and
initialized our database. I’m going to grab the documentation (sf1.pdf), along with
some MS Access database files they use to store the schema.


wget --continue -e robots=off --force-directories --no-parent https://www.census.gov/prod/cen2010/doc/sf1.pdf
wget --recursive --continue -e robots=off --force-directories --no-parent https://www2.census.gov/census_2010/04-Summary_File_1/SF1_Access2003.mdb
wget --recursive --continue -e robots=off --force-directories --no-parent https://www2.census.gov/census_2010/04-Summary_File_1/SF1_Access2007.accdb
wget --recursive --continue -e robots=off --force-directories --no-parent https://www2.census.gov/census_2010/04-Summary_File_1/Pennsylvania/
wget --recursive --continue -e robots=off --force-directories --no-parent https://www2.census.gov/census_2010/05-Summary_File_2/SF2_MSAccess_2003.mdb
wget --recursive --continue -e robots=off --force-directories --no-parent https://www2.census.gov/census_2010/05-Summary_File_2/SF2_MSAccess_2007.accdb
wget --recursive --continue -e robots=off --force-directories --no-parent https://www2.census.gov/census_2010/05-Summary_File_2/Pennsylvania/

Some notes, --recursive will grab linked files, --continue will try to
start mid-file if it’s been partially downloaded. -e robots=off is naughty
and tels wget to download things even if robots.txt tells you not to. (I don’t
think it’s an issue here, I just add it all the time.) --force-directories
will create a directory structure begining with the domain so
“https://example.com/a/b/c.d” will be downloaded to a file
“example.com/a/b/c.d” — I find this makes keeping track of what things are and
where they came from a lot easier. --no-parent tells wget not to go up the
directory tree to get assets, which could cause it to begin going down sibling
directories of the root we gave it.


Now, we are going import that data. The first step
is to unzip the datafile. Be warned, the unziped data can swell in size. For
example, the pa dataset is compressed 90%: the zip is 436MiB and uncompressed
is 4.4GiB.


unzip pa2010.sf1.zip

Now that we have the files, let’s build the schema. Mostly because I love
using the commandline whenever possible, I decided to stream the schema
from the Access sample databases through some fix-ups and then into a file.


mdb-schema SF1_Access2003.mdb |
  # MS Access uses [] as idenifier quotes, let’s fix that
  tr ‘[]’ ‘”“’ |

# Some tables have a name suffix of `mod` because the first few have been # removed to keep the table under 255 fields; let’s add them back in. sed ‘/mod”$/{$!{N;s/mod”\n (/”\n (\n “FILEID” text,\n “STUSAB” text,\n “CHARITER” text,\n “CIFSN” text,/g}}’ | # Remove spaces from identifiers. sed -E ‘s/(\s+)”([”]+) ([”]+)”/\1\2_\3/g’ | # Convert to PG type names sed -E ‘s/Text\s*\([0-9]+\)/text/g’ | sed -E ‘s/(Long )?Integer/bigint/g’ | sed -E ‘s/Double/double precision/g’ | # Table 45 is split into 2 parts; let’s merge them by deleting the trailing # ); of the one and the CREATE TABLE along with the first few fields of the # second. sed ‘s/_PT1//g’ | sed ‘/);$/{$!{N;N;N;N;N;N;N;N;s/.*_PT2.*LOG[^,]*//g}}’ | # Just remove the quotes — they’re not needed anymore tr -d ‘”’ > SF1_pg.schma.sql

Then, we can send that into postgres, but I want to wrap it all in a commit
so that we don’t end up with a half-built database if something fails.


cat &lt;(echo "set role census;") \
    &lt;(echo "begin; create schema census2010; set search_path = census2010;") \
    SF1_pg.schma.sql \
    &lt;(echo "commit;") |
  psql -v ON_ERROR_STOP=1 census

Now the fun part! We get to import the datafiles! I’m going to use the
copy statement and the
psql \copy meta-command. However, like above,
I’m going to build up a sql file before running it so that everything can
be wrapped in a transaction. Yes, this step can take a bit and if it fails
towards the end it sucks, but in the end it doesn’t take that long (a few
minutes) and it’s significantly easier than trying to scrub duplicates. (We
haven’t built indices yet as having them during the input slows it down
considerably.


echo "begin;" &gt; import_script
for input_file in pa000*2010.sf1; do
  table_name=SF1_`echo $input_file | sed 's/pa\(.\{5\}\)2010.sf1/\1/'`
  # Provides some feedback as `copy` will only output the number of rows imported.
  echo "\\\echo $input_file  $table_name" &gt;&gt; import_script
  echo "\\\copy census2010.${table_name} from '${input_file}' with (format csv, header false);\n" &gt;&gt; import_script
done
echo "commit;" &gt;&gt; import_script
cat import_script | psql -v ON_ERROR_STOP=1 census

Now comes the more interesting part: the geoheader (links demographic records
with the geography) is a fixed-width file (i.e. fields are not delimited, but
are defined as between character positions on a line). What Sophia did was
to build a staging table that would contain the raw lines, and then do a
“insert into select” with a select statement that teased out the fields using
the substring function. We’ll do a very similar thing. The fields are
defined in the SF1
documentation


echo "set role census; CREATE TABLE census2010.geo_header_staging(data text);" | psql census
echo "\\\copy census2010.geo_header_staging from 'pageo2010.sf1';" | psql census

Then we’ll tease apart that staging table.


INSERT INTO census2010.geo_header_sf1 (fileid, stusab, sumlev, geocomp, chariter, cifsn, logrecno, region, division, state, county, countycc, countysc, cousub, cousubcc, cousubsc, place, placecc, placesc, tract, blkgrp, block, iuc, concit, concitcc, concitsc, aianhh, aianhhfp, aianhhcc, aihhtli, aitsce, aits, aitscc, ttract, tblkgrp, anrc, anrccc, cbsa, cbsasc, metdiv, csa, necta, nectasc, nectadiv, cnecta, cbsapci, nectapci, ua, uasc, uatype, ur, cd, sldu, sldl, vtd, vtdi, reserve2, zcta5, submcd, submcdcc, sdelem, sdsec, sduni, arealand, areawatr, name, funcstat, gcuni, pop100, hu100, intptlat, intptlon, lsadc, partflag, reserve3, uga, statens, countyns, cousubns, placens, concitns, aianhhns, aitsns, anrcns, submcdns, cd113, cd114, cd115, sldu2, sldu3, sldu4, sldl2, sldl3, sldl4, aianhhsc, csasc, cnectasc, memi, nmemi, puma, reserved)
SELECT
nullif(trim(substring(data,1,6)), '') AS fileid,
nullif(trim(substring(data,7,2)), '') AS stusab,
nullif(trim(substring(data,9,3)), '') AS sumlev,
nullif(trim(substring(data,12,2)), '') AS geocomp,
nullif(trim(substring(data,14,3)), '') AS chariter,
nullif(trim(substring(data,17,2)), '') AS cifsn,
nullif(trim(substring(data,19,7)), '')::bigint AS logrecno,
nullif(trim(substring(data,26,1)), '') AS region,
nullif(trim(substring(data,27,1)), '') AS division,
nullif(trim(substring(data,28,2)), '') AS state,
nullif(trim(substring(data,30,3)), '') AS county,
nullif(trim(substring(data,33,2)), '') AS countycc,
nullif(trim(substring(data,35,2)), '') AS countysc,
nullif(trim(substring(data,37,5)), '') AS cousub,
nullif(trim(substring(data,42,2)), '') AS cousubcc,
nullif(trim(substring(data,44,2)), '') AS cousubsc,
nullif(trim(substring(data,46,5)), '') AS place,
nullif(trim(substring(data,51,2)), '') AS placecc,
nullif(trim(substring(data,53,2)), '') AS placesc,
nullif(trim(substring(data,55,6)), '') AS tract,
nullif(trim(substring(data,61,1)), '') AS blkgrp,
nullif(trim(substring(data,62,4)), '') AS block,
nullif(trim(substring(data,66,2)), '') AS iuc,
nullif(trim(substring(data,68,5)), '') AS concit,
nullif(trim(substring(data,73,2)), '') AS concitcc,
nullif(trim(substring(data,75,2)), '') AS concitsc,
nullif(trim(substring(data,77,4)), '') AS aianhh,
nullif(trim(substring(data,81,5)), '') AS aianhhfp,
nullif(trim(substring(data,86,2)), '') AS aianhhcc,
nullif(trim(substring(data,88,1)), '') AS aihhtli,
nullif(trim(substring(data,89,3)), '') AS aitsce,
nullif(trim(substring(data,92,5)), '') AS aits,
nullif(trim(substring(data,97,2)), '') AS aitscc,
nullif(trim(substring(data,99,6)), '') AS ttract,
nullif(trim(substring(data,105,1)), '') AS tblkgrp,
nullif(trim(substring(data,106,5)), '') AS anrc,
nullif(trim(substring(data,111,2)), '') AS anrccc,
nullif(trim(substring(data,113,5)), '') AS cbsa,
nullif(trim(substring(data,118,2)), '') AS cbsasc,
nullif(trim(substring(data,120,5)), '') AS metdiv,
nullif(trim(substring(data,125,3)), '') AS csa,
nullif(trim(substring(data,128,5)), '') AS necta,
nullif(trim(substring(data,133,2)), '') AS nectasc,
nullif(trim(substring(data,135,5)), '') AS nectadiv,
nullif(trim(substring(data,140,3)), '') AS cnecta,
nullif(trim(substring(data,143,1)), '') AS cbsapci,
nullif(trim(substring(data,144,1)), '') AS nectapci,
nullif(trim(substring(data,145,5)), '') AS ua,
nullif(trim(substring(data,150,2)), '') AS uasc,
nullif(trim(substring(data,152,1)), '') AS uatype,
nullif(trim(substring(data,153,1)), '') AS ur,
nullif(trim(substring(data,154,2)), '') AS cd,
nullif(trim(substring(data,156,3)), '') AS sldu,
nullif(trim(substring(data,159,3)), '') AS sldl,
nullif(trim(substring(data,162,6)), '') AS vtd,
nullif(trim(substring(data,168,1)), '') AS vtdi,
nullif(trim(substring(data,169,3)), '') AS reserve2,
nullif(trim(substring(data,172,5)), '') AS zcta5,
nullif(trim(substring(data,177,5)), '') AS submcd,
nullif(trim(substring(data,182,2)), '') AS submcdcc,
nullif(trim(substring(data,184,5)), '') AS sdelem,
nullif(trim(substring(data,189,5)), '') AS sdsec,
nullif(trim(substring(data,194,5)), '') AS sduni,
nullif(trim(substring(data,199,14)), '')::double precision AS arealand,
nullif(trim(substring(data,213,14)), '')::double precision AS areawatr,
nullif(trim(substring(data,227,90)), '') AS name,
nullif(trim(substring(data,317,1)), '') AS funcstat,
nullif(trim(substring(data,318,1)), '') AS gcuni,
nullif(trim(substring(data,319,9)), '')::bigint AS pop100,
nullif(trim(substring(data,328,9)), '')::bigint AS hu100,
nullif(trim(substring(data,337,11)), '') AS intptlat,
nullif(trim(substring(data,348,12)), '') AS intptlon,
nullif(trim(substring(data,360,2)), '') AS lsadc,
nullif(trim(substring(data,362,1)), '') AS partflag,
nullif(trim(substring(data,363,6)), '') AS reserve3,
nullif(trim(substring(data,369,5)), '') AS uga,
nullif(trim(substring(data,374,8)), '') AS statens,
nullif(trim(substring(data,382,8)), '') AS countyns,
nullif(trim(substring(data,390,8)), '') AS cousubns,
nullif(trim(substring(data,398,8)), '') AS placens,
nullif(trim(substring(data,406,8)), '') AS concitns,
nullif(trim(substring(data,414,8)), '') AS aianhhns,
nullif(trim(substring(data,422,8)), '') AS aitsns,
nullif(trim(substring(data,430,8)), '') AS anrcns,
nullif(trim(substring(data,438,8)), '') AS submcdns,
nullif(trim(substring(data,446,2)), '') AS cd113,
nullif(trim(substring(data,448,2)), '') AS cd114,
nullif(trim(substring(data,450,2)), '') AS cd115,
nullif(trim(substring(data,452,3)), '') AS sldu2,
nullif(trim(substring(data,455,3)), '') AS sldu3,
nullif(trim(substring(data,458,3)), '') AS sldu4,
nullif(trim(substring(data,461,3)), '') AS sldl2,
nullif(trim(substring(data,464,3)), '') AS sldl3,
nullif(trim(substring(data,467,3)), '') AS sldl4,
nullif(trim(substring(data,470,2)), '') AS aianhhsc,
nullif(trim(substring(data,472,2)), '') AS csasc,
nullif(trim(substring(data,474,2)), '') AS cnectasc,
nullif(trim(substring(data,476,1)), '') AS memi,
nullif(trim(substring(data,477,1)), '') AS nmemi,
nullif(trim(substring(data,478,5)), '') AS puma,
nullif(trim(substring(data,483,18)), '') AS reserved
FROM census2010.geo_header_staging;

Now, let’s delete that staging table.


drop table census2010.geo_header_staging;

Let’s build some indices and foreign keys.

echo “begin;” > create_index
for table in `echo “\\\dt census2010.*” | psql -tA census | grep -v data | cut -f 2 -d \|`; do
  cat <<EOS >> create_index
create index on census2010.${table}(fileid);
create index on census2010.${table}(stusab);
create index on census2010.${table}(chariter);
create index on census2010.${table}(cifsn);
create unique index on census2010.${table}(logrecno);

	

alter table census2010.${table} alter column fileid set not null;
alter table census2010.${table} alter column stusab set not null;
alter table census2010.${table} alter column chariter set not null;
alter table census2010.${table} alter column cifsn set not null;
alter table census2010.${table} alter column logrecno set not null;

alter table census2010.${table} add primary key using index ${table}_logrecno_idx;

EOS
done
echo “commit;” >> create_index
cat create_index | psql -v ON_ERROR_STOP=1 census

create unique index on census2010.geo_header_sf1(fileid, stusab, chariter, cifsn, logrecno);

echo “begin;” > create_fk
for table in `echo “\\\dt census2010.*” | psql -tA census | grep -v data | grep -v geo | cut -f 2 -d \|`; do cat <<EOS >> create_fk
alter table census2010.${table} add constraint ${table}_geo_fk foreign key (logrecno) references census2010.geo_header_sf1(logrecno);
EOS
done
echo “commit;” >> create_fk
cat create_fk | psql -v ON_ERROR_STOP=1 census


Service Area Population


In order to speed up lookups between the TIGER data, let’s create an index to
use. sumlev (Summary Level) is the aggregation that the row represents,
e.g. state, county, track, block group, block, &c. They’re defined in the
SF1 documentation. 101 is
the block level.


create index on census2010.geo_header_sf1 (state, county, tract, block) where sumlev = '101';

Calculating the total population of the county can be done by simply using
field p0010001 (total population) in table sf1_00001 (also defined in the
SF1 documentation), but only for the rows representing the block summary level.
Without filtering on the summary level the population would be counted for
every summary level in the county.


select sum(p0010001) 
from census2010.sf1_00001 
join census2010.geo_header_sf1 using (logrecno)
where county = '003' 
  and sumlev = '101';

For the population in our walkshed, we will use the stops_walkshed_blocks
table that we generated in Part 2. As a recap, that is every block that is
adjacent to a segment of road that is within a quarter mile of a bus stop. (The
reason for the group by is that stops_walkshed_blocks is broken down by
stop, which obviously leads to some blocks being counted multiple times.)


select sum(p0010001)
from (
  select max(p0010001) as p0010001 
  from census2010.sf1_00001
  join census2010.geo_header_sf1 using (logrecno) 
  join tiger2017.tabblock 
    on  statefp10 = state 
    and countyfp10 = county
    and tractce10 = tract 
    and blockce10 = block
  join stops_walkshed_blocks 
    on stops_walkshed_blocks.geoid10 = tabblock.geoid10
  where county = '003' and sumlev = '101'
  group by tabblock.geoid10) x;

Results


This yields a total population of 1,223,348 and a service area population of
775,835 which is just over 63% of the county’s population.