This blog is about my musings and thoughts. I hope you find it useful, at most, and entertaining, at least.
In GIS a projection defines how a three dimentional object is drawn in two dimentions. There are two basic types of projections, Geographic Coordinate Systems (GCS) and Projected Coordinate Systems (PCS). While there is much to be said about different types of projections, and I would highly recommend starting with the wikipedia article and moving to the ones on the ESRI site, here I’m going to focus on the difference and usage GCS and PCS.
GCS is a three dimensional model that is fit to the Earth and how to map that to a two dimensional surface. A position on it is measured as the angle north or south of the Equator and east or west of the Prime Meridian. An example is the familar WGS84/EPSG 4326 (GPS coordinates). However, since the Earth isn’t a perfect spheroid, GCS are OK everywhere, but better in some places.
The green line represents the shape of the Earth, whereas the black line represents a perfect sheroid; the match is very close in the bottom right, but not as good in the top left.
A PCS, on the other hand, is defined on a two dimensional surface and how to map that onto a three dimentional model. A position on it is measured as a unit of length (often the meter or foot) left or right and above or below an origin. Am example is the Pennsylvania State Plane South which is valid for the souther counties in Pennsylvania and UTM Zone 17N (used in the Military Grid Reference System).
PCS are valid over a smaller area than a GCS, but allow for more accurate measure of length and area within those smaller areas; this is because the Earth isn’t spherical and an angular change at the Equator is not equal to an angular change on the Prime Meridian. And the irregularities of the shape of the Earth make compesnsation for the spheroidal shape still imperfect.
As an example, if I wanted to create a 110mi buffer around Pittsburgh, which is approximately 2 degrees at my latitude.
However, I know that Erie is roughly 110mi from Pittsburgh (about 120mi driving), so I immediately knew something was wrong. It dawned on me that I had used a GCS and I’m a bit beyond where you can fudge a measure up equals a measure left. I converted my layer to a PCS, Zone 17N, which uses meters as its base unit and created a 178km buffer.
That looked more realistic and I double checked some distances to some destinations in Ohio, Pennsylvania, and Weste Virginia to be sure.
To illustrate the difference, I overlayed my original 2 degrees buffer.
You can see that 2 degrees, when measured in meters, is an oval, and not a circle.
To sum up: use a GCS when you are dealing with the relationship between objects over a very wide area and a PCS when you are dealing with linear measurments (e.g. distance and area). Which GCS or PCS to use is another set of posts.
The Allegheny County Offered by the Office of Property Assessments offers a CD with information for all of the properties in the county that is updated quarterly. The CD contains a single 242MB file, currently in xlsx format.
To convert the Microsoft format to a format which can be consumed by PostgreSQL, namely CSV, LibreOffice can be used:
libreoffice --headless --convert-to csv ALLEGHENY_COUNTY_MASTER_FILE_07022015.xlsx
We can now create a table to import the CSV directly into. I created this schema by looking at the data found in the csv and correcting any errors found on import. PostgreSQL is very nice in that it won’t truncate data and will complain if invalid data is inserted. If you choose to use MySQL, please be aware that it can truncate and coerce data.
CREATE TABLE tax_assessments ( parid character(16), propertyowner character varying(255), propertyhousenum character varying(255), propertyfraction character varying(255), propertyaddress character varying(255), propertycity character varying(255), propertystate character varying(255), propertyunit character varying(255), propertylocation2 character varying(255), propertyzip character(5), municode character varying(4), munidesc character varying(255), schoolcode character varying(4), schooldesc character varying(255), neighcode character varying(50), neighdesc character varying(255), taxcode character(1), taxdesc character varying(255), ownercode character(2), ownerdesc character varying(255), statecode character(1), statedesc character varying(255), usecode character(3), usedesc character varying(255), lotarea integer, homesteadflag character(3), farmsteadflag character(3), saledate character(8), saleprice integer, salecode character(2), saledesc character varying(255), deedbook character varying(50), deedpage character varying(50), mabt character(1), agent character varying(255), taxfulladdress1 character varying(255), taxfulladdress2 character varying(255), taxfulladdress3 character varying(255), taxfulladdress4 character(5), changenoticeaddress1 character varying(255), changenoticeaddress2 character varying(255), changenoticeaddress3 character varying(255), changenoticeaddress4 character(5), countybuilding integer, countyland integer, countytotal integer, countyexemptbldg integer, localbuilding integer, localland integer, localtotal integer, fairmarketbuilding integer, fairmarketland integer, fairmarkettotal integer, style character(2), styledesc character varying(255), stories numeric, yearblt character(4), exteriorfinish character(1), extfinish_desc character varying(255), roof character(1), roofdesc character varying(255), basement character(1), basementdesc character varying(255), grade character varying(3), gradedesc character varying(255), condition character(1), conditiondesc character varying(255), totalrooms integer, bedrooms integer, fullbaths integer, halfbaths integer, heatingcooling character(1), heatingcoolingdesc character varying(255), fireplaces integer, attachedgarages integer, finishedlivingarea integer, cardnumber integer, alt_id character varying(255), taxsubcode_desc character varying(255), taxsubcode character(1) ); CREATE INDEX parid_idx ON tax_assessments ( parid );
From there we can import the CSV:
COPY tax_assessments FROM 'ALLEGHENY_COUNTY_MASTER_FILE_07022015.csv' WITH CSV HEADER
Here’s an assorted set of some of the fields. You’ll note that these are sorted lexically, not numerically and that the fields are character fields (see above). This is intentional because since I didn’t create this data, I can’t always guarentee there won’t be a change in the various codes later on. See
heatingcooling. There is no reason you couldn’t make them integer fields in your own import as applicable.
|12||REGULAR-ETUX OR ET VIR|
|16||REGULAR-ETAL & ETUX|
|17||Fox Chapel Area|
|30||Penn Hills Twp|
|36||South Fayette Twp|
|42||Upper St Clair|
|45||West Mifflin Area|
|47||City Of Pittsburgh|
|1||RES SKELETON RECORD|
|102||LIVE STOCK FARM|
|105||FRUIT & NUT FARM|
|109||GREENHOUSES, VEG & FLORACULTURE|
|110||>10 ACRES VACANT|
|112||LIVESTOCK O/T D & P-CAUV|
|115||FRUIT & NUT FARM – CAUV|
|118||CONDOMINIUM COMMON PROPERTY|
|120||TIMBER OR FOREST LAND|
|130||RIGHT OF WAY – RESIDENTIAL|
|131||RETENTION POND – RESIDENTIAL|
|199||OTHER AGRICULTURAL – CAUV|
|2||COM SKELETON RECORD|
|210||COAL LAND, SURFACE RIGHTS|
|220||COAL RIGHTS, WORKING INTERESTS|
|230||COAL RIGHTS SEP. ROYALTY INTEREST|
|240||OIL & GAS RIGHTS WORKING INTEREST|
|300||VACANT INDUSTRIAL LAND|
|310||FOOD & DRINK PROCESSING|
|317||FORESTRY WITH BUILDING|
|345||BULK TRANSFER TERMINAL|
|360||INDUSTRIAL TRUCK TERM|
|380||MINES AND QUARRIES|
|400||VACANT COMMERCIAL LAND|
|401||APART: 5-19 UNITS|
|409||BED & BREAKFAST|
|410||MOTEL & TOURIST CABINS|
|412||NURSING HOME/PRIVATE HOS|
|413||INDEPENDENT LIVING (SENIORS)|
|415||MOBILE HOMES/TRAILER PKS|
|419||OTHER COMMERCIAL HOUSING|
|420||SMALL DETACHED RET|
|425||NEIGH SHOP CENTER|
|426||COMMUNITY SHOPPING CENTER|
|427||REGIONAL SHOPPING CENTER|
|429||OTHER RETAIL STRUCTURES|
|430||RESTAURANT, CAFET AND/OR BAR|
|435||DRIVE IN REST OR FOOD SERVICE|
|437||FAST FOOD/DRIVE THRU WINDOW|
|439||OTHER FOOD SERVICE|
|440||DRY CLEANING PLANTS/LAUNDRIES|
|445||SAVINGS AND LOANS|
|447||OFFICE – 1-2 STORIES|
|448||OFFICE-WALKUP -3 + STORIES|
|449||OFFICE-ELEVATOR -3 + STORIES|
|450||CONDOMINIUM OFFICE BUILDING|
|452||AUTO SERV STATION|
|454||AUTO SALES & SERVICE|
|458||GAS STATION KIOSK|
|462||GOLF DRIVING RANGE/MINIATURE|
|463||GOLF COURSES (PUBLIC)|
|464||BOWLING ALLEYS/REC FACILITY|
|465||LODGE HALL/AMUSEMENT PARK|
|470||DWG USED AS OFFICE|
|471||DWG USED AS RETAIL|
|472||DWG APT CONVERSION|
|474||HEAVY EQUIPMENT SALES/RENTAL|
|482||COMMERCIAL TRUCK TERMINAL|
|490||MARINE SERV FACILITY|
|493||CONVENIENCE STORE GAS/REPAIRS|
|494||BIG BOX RETAIL|
|499||COMM AUX BUILDING|
|500||RESIDENTIAL VACANT LAND|
|501||VACANT LAND 0-9 ACRES|
|530||RIGHT OF WAY – COMMERCIAL|
|531||RETENTION POND – COMMERCIAL|
|553||H.O.A RECREATIONS AREA|
|556||COMMON AREA OR GREENBELT|
|557||COMM APRTM CONDOS 5-19 UNITS|
|558||COMM APRTM CONDOS 20-39 UNITS|
|559||COMM APRTM CONDOS 40+ UNITS|
|56||CONDO DEVELOPMENTAL LAND|
|57||CONDO GARAGE UNITS|
|599||OTHER RESIDENTIAL STRUCTURE|
|601||HUD PROJ #202|
|602||HUD PROJ #207/223|
|603||HUD PROJ #213|
|604||HUD PROJ #220|
|605||HUD PROJ #221|
|606||HUD PROJ #223|
|607||HUD PROJ #232|
|609||HUD PROJ #236|
|645||OWNED BY METRO HOUSING AU|
|650||OWNED BY BOARD OF EDUCATION|
|670||OWNED BY COLLEGE/UNIV/ACADEMY|
|685||CHURCHES, PUBLIC WORSHIP|
|700||COMMUNITY URBAN RENEWAL|
|730||MUNICIPAL URBAN RENEWAL|
|777||INCOME PRODUCING PARKING LOT|
|840||R.R. – USED IN OPERATION|
|850||R.R. – NOT USED IN OPERATION|
|860||RR-PP – USED IN OPERATION|
|880||P.P. – P.U. – OTHER THAN R.R.|
|90||MOBILE HOME (IN PARK)|
|96||MINOR FIRE DAMAGE|
|97||TOTAL/MAJOR FIRE DAMAGE|
|99||RES AUX BUILDING (NO HOUSE)|
|996||MINOR FIRE DAMAGE – COMM|
|997||TOTAL/MAJOR FIRE DAMAGE – COMM|
|998||TOTAL/MAJOR FIRE DAMAGE – COMM|
|A||No Heat but with AC|
|B||Central Heat with AC|
|C||Wall Furnace with AC|
|D||Electric Heat with AC|
|E||Unit Heat with AC|
|F||Heat Pump with AC|
|G||Floor Furnace with AC|
No union is more profound than marriage, for it embodies the highest ideals of love, fidelity, devotion, sacrifice, and family. In forming a marital union, two people become something greater than once they were. As some of the petitioners in these cases demonstrate, marriage embodies a love that may endure even past death. It would misunderstand these men and women to say they disrespect the idea of marriage. Their plea is that they do respect it, respect it so deeply that they seek to find its fulfillment for themselves. Their hope is not to be condemned to live in loneliness, excluded from one of civilization’s oldest institutions. They ask for equal dignity in the eyes of the law. The Constitution grants them that right.
“Here’s the problem with this kind of thing; for the people who don’t seem to get it.
It isn’t a matter of whether or not you’ve done anything wrong. It isn’t a matter of having anything to hide.
It’s a matter of peace of mind. It’s a matter of exercising your rights without having someone snooping into whatever you do. You think what you do is innocent? That’s great. But it’s not you that you have to worry about. It’s the interpretation of others that matter here. Others who may look at the books you read or the sites you visit or the causes you support or the things you say and read something sinister into it because of their own paranoia and prejudices.
Even if nothing comes of it, the knowledge you’re being watched is enough to put most people on edge. You may find yourself not reading that book or not visiting that site or supporting that cause or saying that thing … all because you worry about what someone else will think.
You shouldn’t have to. That’s the issue.” author unknown
It deeply saddens me when I encounter people who feel that blanket surveillance keeps us “safe”. I, and many many many others feel threatened and invariably less free because of surveillance. The fear of worrying about how other people perceive your actions is a terrible burden — so terrible we constantly tell our children and ourselves “not to worry what other people think”. However, as that becomes impossible, people will inevitably self-censor in order to not be noticed; which may be the worst possible thing that can happen in a “democracy” (represenative or otherwise).
“With the first link, the chain is forged. The first speech censured, the first thought forbidden, the first freedom denied, chains us all irrevocably.”
I happened across 1-meter National Agriculture Imagery Program (NAIP) imagery on PASDA. Excitedly I downloaded the zip file for my county and uncompressed it. Inside was a shp file and a very large sid file. The shp contains what appear to be the photography tracks and not very useful for my purposes (looking at pretty pictures!). I attempted to add the sid as a raster layer only to be greeted by an error telling me that it’s an unsupported filetype.
What is this format?
file was no help, just telling me it’s “data” I hit Google to attempt to figure out what this mystery format was and found out it was called MrSID (Multiresolution Seamless Image Database) an image compression format developed at Los Alamos and patented by LizardTech. (Aside: Why a government agency is giving out data in a proprietary format when suitable open alternatives exist is beyond me.)
The friendly denizens of #qgis confirmed what I found and pointed me at a command line tool from LizardTech that allows one to convert MrSID files into a more open format. (While I generally don’t run proprietary code on my laptop, I sucked it up and did the conversion).
geoexpress_commandlineutils_linux/linux64/GeoExpressCLUtils-22.214.171.12482/bin/mrsidgeodecode -i ortho_1-1_1n_s_pa003_2013_1.sid -o ortho_1-1_1n_s_pa003_2013_1.tiff
BEWARE: This will increase the size of the file 15 to 20 times.
To get some idea of what’s stored in this file you can run
and see that there are 3 bands (Red, Green, and Blue) and actual size of the image.
Driver: GTiff/GeoTIFF Files: ortho_1-1_1n_s_pa003_2013_1.tiff Size is 59112, 56449 Coordinate System is `' Metadata: TIFFTAG_XRESOLUTION=1 TIFFTAG_YRESOLUTION=1 Image Structure Metadata: INTERLEAVE=PIXEL Corner Coordinates: Upper Left ( 0.0, 0.0) Lower Left ( 0.0,56449.0) Upper Right (59112.0, 0.0) Lower Right (59112.0,56449.0) Center (29556.0,28224.5) Band 1 Block=59112x1 Type=Byte, ColorInterp=Red Band 2 Block=59112x1 Type=Byte, ColorInterp=Green Band 3 Block=59112x1 Type=Byte, ColorInterp=Blue
This tiff, despite being 10GB loaded just peachy in QGIS. However, I wanted some of that disk space back, so I was told to convert it to a jpeg image. Additionally, the file does not contain it’s own projection (and probably is relying on the prj file in the same directory), so I decided to add that into the jpeg.
gdal_translate -co TILED=YES -co COMPRESS=JPEG -co PHOTOMETRIC=YCBCR -co JPEG_QUALITY=85 -a_srs ortho_1-1_1n_s_pa003_2013_1.prj ortho_1-1_1n_s_pa003_2013_1.tiff ortho_1-1_1n_s_pa003_2013_1.jpeg
gdalinfo shows that we converted it, it’s the same size, and it’s the correct projection.
Driver: GTiff/GeoTIFF Files: ortho_1-1_1n_s_pa003_2013_1.jpeg Size is 59112, 56449 Coordinate System is: PROJCS["NAD_1983_UTM_Zone_17N", GEOGCS["GCS_North_American_1983", DATUM["D_North_American_1983", SPHEROID["GRS_1980",6378137,298.257222101]], PRIMEM["Greenwich",0], UNIT["degree",0.0174532925199433]], PROJECTION["Transverse_Mercator"], PARAMETER["latitude_of_origin",0], PARAMETER["central_meridian",-81], PARAMETER["scale_factor",0.9996], PARAMETER["false_easting",500000], PARAMETER["false_northing",0], UNIT["metre",1, AUTHORITY["EPSG","9001"]]] Metadata: AREA_OR_POINT=Area TIFFTAG_XRESOLUTION=1 TIFFTAG_YRESOLUTION=1 Image Structure Metadata: COMPRESSION=YCbCr JPEG INTERLEAVE=PIXEL SOURCE_COLOR_SPACE=YCbCr Corner Coordinates: Upper Left ( 0.0, 0.0) Lower Left ( 0.0,56449.0) Upper Right (59112.0, 0.0) Lower Right (59112.0,56449.0) Center (29556.0,28224.5) Band 1 Block=256x256 Type=Byte, ColorInterp=Red Band 2 Block=256x256 Type=Byte, ColorInterp=Green Band 3 Block=256x256 Type=Byte, ColorInterp=Blue
In #qgis I was told that it’s usually prudent to precalulate overviews (which are similar to zoom-levels in a TMS). This calculates an image pyramid, but not exactly that generates images that can be used to quickly display the image when zoomed in and out without having to compute it from the base blocks every time.
gdaladdo -r gauss ortho_1-1_1n_s_pa003_2013_1.jpeg 2 4 8 16 32 64 128 256 512
Will compute overviews for half, quarter, eight, &c of the entire image (Note that these are powers of 2, as that works best because it becomes easy to calculate the size of the pyramid).
gdalinfo shows that we’ve added overviews.
Driver: GTiff/GeoTIFF Files: ortho_1-1_1n_s_pa003_2013_1.jpeg Size is 59112, 56449 Coordinate System is: PROJCS["NAD_1983_UTM_Zone_17N", GEOGCS["GCS_North_American_1983", DATUM["D_North_American_1983", SPHEROID["GRS_1980",6378137,298.257222101]], PRIMEM["Greenwich",0], UNIT["degree",0.0174532925199433]], PROJECTION["Transverse_Mercator"], PARAMETER["latitude_of_origin",0], PARAMETER["central_meridian",-81], PARAMETER["scale_factor",0.9996], PARAMETER["false_easting",500000], PARAMETER["false_northing",0], UNIT["metre",1, AUTHORITY["EPSG","9001"]]] Metadata: AREA_OR_POINT=Area TIFFTAG_XRESOLUTION=1 TIFFTAG_YRESOLUTION=1 Image Structure Metadata: COMPRESSION=YCbCr JPEG INTERLEAVE=PIXEL SOURCE_COLOR_SPACE=YCbCr Corner Coordinates: Upper Left ( 0.0, 0.0) Lower Left ( 0.0,56449.0) Upper Right (59112.0, 0.0) Lower Right (59112.0,56449.0) Center (29556.0,28224.5) Band 1 Block=256x256 Type=Byte, ColorInterp=Red Overviews: 29556x28225, 14778x14113, 7389x7057, 3695x3529, 1848x1765, 924x883, 462x442, 231x221, 116x111 Band 2 Block=256x256 Type=Byte, ColorInterp=Green Overviews: 29556x28225, 14778x14113, 7389x7057, 3695x3529, 1848x1765, 924x883, 462x442, 231x221, 116x111 Band 3 Block=256x256 Type=Byte, ColorInterp=Blue Overviews: 29556x28225, 14778x14113, 7389x7057, 3695x3529, 1848x1765, 924x883, 462x442, 231x221, 116x111
Now, how does this compare with the original SRID in terms of size?
|JPEG (with overviews)||1.6G||2.5|
In terms of quality, I haven’t noticed a difference.