Instruction Manual: Making the system highway list (.csv file)

 

Purpose of the files

  • Each highway system needs to have two .csv files that list basic information about each route that is part of the system. The Chopped Routes File (e.g., usai.csv) tells our scripts which files to look for and to load into the web site. The Connected Routes File (e.g., usai_con.csv) lists how those chopped routes are connected across boundaries to make full-length routes.
  • Make a spreadsheet with all the needed info for each of these files. Then save the spreadsheet as a .csv file as explained below. The .csv file is the one we need. The spreadsheet is to help you get the .csv format correct.

Chopped Routes File (e.g., usai.csv) format:

  • Make a spreadsheet in OpenOffice or a similar spreadsheet program with the name systemcode.xls. Replace systemcode with the lowercase system code assigned to your system. E.g., the spreadsheet for the Ohio State Highways is usaoh.xls (system code = "usaoh"). The .csv file is usaoh.csv.
  • The spreadsheet must have 8 columns and 1 header row. Use the header row to label the columns, not to enter the first highway. In processing, the first row is always ignored.

    The columns:
  1. System: the short, all-lowercase system code for the system. It should appear in every row of the spreadsheet even though it seems redundant. It is also used for the filename above.
  2. Region: For undivided countries, this is the uppercase 3-letter country abbreviation in which the highway is located. IDN for Indonesia, LUX for Luxembourg, NZL for New Zealand. In the United States and Canada, it's the uppercase state or province abbreviation. FL for Florida, NT for Northwest Territories. For other divided countries (DEU, ESP, MEX, etc.), use the 3-letter country code, followed by a hyphen, followed by the subdivision code (no spaces). DEU-TH for Thuringia, ESP-AR for for Aragon, MEX-BC for Baja California.
  3. Route: The name of the highway, ignoring any banners or qualifiers. No spaces! US34 for US 34, OH17 for OH 17, PA66 for Business PA 66, A7 for French Autoroute A7. Skip hyphens and slashes unless they separate two numbers (M22-1), the route is a US Interstate or Business Interstate (I-80BL), or is a Quebec Autoroute (A-50).
  4. Banner: For bannered routes, put the 3-letter banner abbreviation(s) (Bus, Alt, Spr, etc.) here only if needed. Otherwise leave it completely blank (no whitespace). Lp for Loop is two letters. No more than six characters (for double-bannered routes) are allowed.
  5. Abbreviation: The three-letter city abbreviation if needed for auxiliary highways or for most piecemeal highways. Pit for Truck US 19 (Pittsburgh). Otherwise leave it blank (no whitespace).
  6. City: The spelled-out city name for an auxiliary route or for distinguishing between pieces of the same highway in the same region. Pittsburgh for Truck US 19 (Pittsburgh). This is the text that will appear below the shield in the highway browser. Usually no text is needed except for auxiliary highways, so leave it blank (no whitespace) in most cases. The City should be spelled as the locals spell it and using international characters as needed.
  7. Filename root: The name of the .wpt file with the extension omitted. pa.us019trkpit, oh.oh007, etc. The filename roots are made all-lowercase and follow this formula: (Region without any hyphens) + period (.) + Route (number padded with zeroes for three digits unless the number is 100+) + Banner (if there is one) + Abbreviation (if there is one). FIXME: Some systems have 4-digit routes now.
  8. Alt Route Names A comma-separated list of deprecated route names, or blank if none. This field is blank for most routes and not blank only when necessary to handle route merges or name changes. While the primary route name is usually split into the Route, Banner, and Abbrev columns, any alt route names have these 3 parts concatenated. The result should be whatever would be entered in a .list file for the old name of the route, e.g. I-22FutTup.

Order of the routes in the file:

  • Route numbers are ascending by route number.
    PA3, PA5, PA8, etc.
  • For like route numbers, suffixless routes come first, followed by suffixed routes of the same number.
    MA2, MA3, MA3A, MA3B, MA4, MA4B, MA4H, MA5, etc.
  • For pieces of the same route, put them in the usual order for the country (i.e., south to north or west to east in the US).
    ..., PA42, PA43 Chadville, PA43 Brownsville, PA43 Pittsburgh, PA43AltCal, PA44, etc.
  • Bannered routes of a certain number come immediately after the bannerless route of the same number and before any suffixed routes of the same number. Bannered routes of the same number but with different banners go in order of the banners. Bannered routes of the same number and banner go in the usual order for the country (e.g., south to north or west to east in the US).
    ..., PA42, PA42AltCen, PA42AltBlo, PA42TrkEag, PA42A, PA42ABusPit, PA42B, PA43, ...
  • Doubly bannered routes come right after the matching singly bannered route. The banner immediately after the route number here matches the banner immediately above the route number as signed in a shield.
    US50AltBusDun is the Dunkirk City business route of US50Alt. US50AltBusDun is signed with a Business banner on top, Alternate banner in the middle, and the number shield at the bottom on a roadside sign assembly.
    ..., US50, US50AltGeo, US50AltBusDun, US50BusFay, US50BusTrkUni, US50ScePin, US50TrkSno, US 51, ...

Connected Routes File (e.g., usai_con.csv) format:

  • The Chopped Routes File lists all the highways in the highway system after being chopped at national and sometimes subdivisional (state, province, oblast, etc.) borders. This Connected Routes File gives the information about which chopped routes should be connected to reconstruct each full route.
  • The filename is systemcode_con.csv, with systemcode replaced with the code for the system.
  • The first row is a header row and should never contain info about the first connected highway.
  • Each subsequent row gives info about the connected routes in the same order as they are given in the Chopped Routes File.
  • Columns:
    1. System: system code.
    2. Route: the common Route name, like US52.
    3. Banner: the common Banner, if the route is bannered, or left blank if not.
    4. Name: like the City field of the Chopped Routes File, the Name field is used to distinguish between otherwise identically named routes (same Route and Banner fields) or to give extra info about numberless or bannered/repeatable designations. Like the City field, this field can contain international characters.
    5. Roots: a comma-separated list (no spaces!) of filename roots of the chopped routes that connect to form this connected route. Continuous roots that were chopped at boundaries will have a list of 2+ file roots, while a route that exists entirely in one region will simply have one root. Remove all spaces.
  • Things to fix manually: FIXME: Rework this section, as connected route CSV files are no longer generated by script.
    1. Check the roots in each row to make sure that the chopped routes indeed compose a continuous route. Sometimes border coordinates in the waypoint files do not line up on each side of a border, and this could cause a single connected route to appear in pieces in more than one line. Merge the rows as necessary to fix this problem, always putting the list of file roots in the Roots field in the correct order.
    2. If there is exactly one row with a certain Route+Banner combination, the Name should be blank unless it is a bannered/repeatable route type, in which case the Name should be filled in (see below for devising Names). For example, NY 17 has 3 chopped routes in order NY - PA - NY, so at least one of these has a non-blank City field in the Chopped Routes File that may have been initially copied to the Name field. But there is only one NY17 route in this system, so the Name should be made blank instead.
    3. If there are multiple rows with the same Route+Banner combination, each row should have a nonblank Name. In short, long, multi-region routes will have Names composed from the region names, while shorter, single-region routes will have Names devised identically to the City field in the Chopped Routes File (usually a city name). Here "region" means the country or the subdivision in a divided country.
      • If the route is lengthy and within one region, use the region name as the Name. For example, the southern US 9 is entirely in Delaware, so the Name should be "Delaware" (and not "DE").
      • If the route is lengthy and spans multiple regions, use both region names separated by space-hyphen-space. Put the region names in the usual order for the system (e.g., region containing the southern or western end first if in the US). For example, the northern US 9 spans New Jersey and New York in that order, so the Name should be "New Jersey - New York".
      • If applying these rules results in the same region used in different Names for rows with the same Route+Banner, add an abbreviated direction (NW., W., C., etc.) to that region name in each Name it appears. For example, the western US 422 would be Named "Ohio - W. Pennsylvania" and the eastern US 422 would be Named "E. Pennsylvania". Alternatively, if one route is wholly on a large island within the region (of size like that of Crete or larger), use the island name along with the region abbreviation. For example, E25 has a piece "Netherlands - Italy" that includes mainland France, as well as pieces in the large islands of Corsica, France, and Sardinia, Italy. The three piece Names should be "Netherlands - Italy", "Corse, FRA", and "Sardegna, ITA".
      • If a route is more local (shorter), is a full beltway, or is a bannered/repeatable route type (mandatory City and Abbrev in the Chopped Routes File), then the Name should be devised in the same way as the City field in the Chopped Routes File (and in most cases, the Name and City fields should be identical).
      • If there are two or more discontinuous routes of the same Route+Banner combination, each wholly within the same, single region, and one is significantly longer than the rest, the above rules would suggest using the region name for the long piece and city names for the short ones. Since the region name as the Name doesn't distinguish the pieces well, change the Name to either "(Main)" (parentheses included) or give this piece a city name. For example, if there are 200-mile and 8-mile disconnected sections of US 47 in the same state, use "(Main)" or a city for the long section and a city name for the short section.

International characters:

  • The Route, Abbreviation, Banner, and of course the Filename Root must be devoid of international characters since these appear in .list files and as filenames. Pick the closest character without any diacritical marks, e.g., "o" for "ö", if it would appear in the Abbreviation field: "Kol" for Köln.
  • The City and Name fields should use the native language name for a place, and it may use international characters. "München", not "Munchen" nor "Munich".
  • International characters must be properly encoded. OpenOffice Calc will do this when saving the spreadsheet as a .csv file (see below). Type the international characters into the spreadsheet. Choose "UTF-8" when saving as a .csv file.
  • Excel doesn't cooperate with this, so if using Excel, consider downloading OpenOffice (it's free), or if you insist on using Excel, you must manually encode the characters. See the UTF-8 to Latin character table.
  • International characters will appear as weird symbols or question marks on the web site if the encoding isn't properly done. This is bad!

The .csv file (the file to submit!)

  • I need a .csv file generated from the spreadsheet you made. I do not need the spreadsheet.
  • The .csv filenames use the system code: usact.csv and usact_con.csv for the system with system code "usact".
  • Spreadsheet programs can save a sheet of a spreadsheet (but not the whole spreadsheet) as a delimited file.
  • Use a semicolon (not a comma!) as the field delimiter, and don't use any text delimiters (no quotes!). In OpenOffice Calc, the default delimiters are comma and double quotes, but instead you must specify semicolons and none for the two types. Saving in UTF-8 encoding is necessary to ensure that international characters are properly handled.
  • In OpenOffice Calc, the three correct options when saving the sheet as a .csv file are: UTF-8 encoding, semicolon delimiter, no text delimiter.
  • Excel will not use semicolon delimiters when saving a .csv file. It will save the file using commas. This is no good. You can do a search and replace on the .csv file to turn the commas into semicolons, but beware: sometimes commas are part of the City and Roots fields. Best advice: download and use OpenOffice Calc.

Whenever a route is added to or deleted from a system, submit an updated .csv file of both types and mention the change in the email.

Example system: Takoma National H Routes (tach)

Takoma is the country, and it has 3 states:
NT - North Takoma
CT - Central Takoma
ST - South Takoma
(Or Takoma could be some large area with these as 3 countries. It does not matter.)

The national H route system spans the 3 states with routes H1 - H4.

Map:

H1 is in NT only.
H2 passes through ST, CT, and NT.
H3 has two parts, one spanning ST and CT, and the other in NT.
H3 has a bannered Alt route around Capital City, CT.
H4 has two parts, both in NT. The Springfield part is the main part, and the Shelbyville part is shorter.

The chopped routes file tach.xls looks like this*:

It lists each highway once per region. This is the same format we've been using since 2009*.

*This section of the manual was written before the AltRouteNames column was added to the Chopped Routes files. The AltRouteNames column should be included in these files, even though these examples don't show it.

The connected routes file tach_con.xls looks like this:

Each line represents one route in total rather than one in each region. The Roots column shows a list of the file root names that comprise that route, delimited by commas only and not also with spaces.

The Name column takes on a role similar to the City column of the chopped routes file. It is a short amount of text to distinguish between routes with identical Route and Banner fields.

The Region and Abbrev columns do not appear in the connected routes file because my scripts can look up this info from the chopped roots file by connecting the two files via the file roots. So this info is not repeated in the .csv files.

Since H1 and H2 are unique after being connected across the boundaries, no Name is necessary.

There are multiple H3's and H4's, so each piece gets a Name. The two-state H3 Name lists the states at its ends with a hyphen in between. If there were more states and it spanned more than two, only the two endpoint states would be the ones listed. The other H3 is within a single state, so that state is the Name.

The two H4s are in the same region, so the Name is the main city that each serves, and in fact it is the City from the chopped routes file.

The City for bannered routes like H3 Alt is mandatory in the chopped routes file, and the same city is used as the Name in the connected routes file.

Many of our highway systems do not have highways spanning more than one region. For example, the Pennsylvania State Highways are completely within Pennsylvania. This means that all the rows of the connected routes file will have a single file root. The bannered routes like PA 8 Truck and the duplicated routes like PA 29 and PA 97 will need Name fields.

In some other systems, there are only a small number of routes that cross a border without changing designation. For example, the New York State Highways, has only NY 17 running into PA and NY 120A dipping into CT. Most rows of the connected routes file will have a single file root, but the row for NY 17 and the row for NY 120A will have a few roots separated by commas.

In the major multi-region-spanning systems, like the I & US highways in the US, the TCH in Canada, the M and A routes in Great Britain, and the Int'l E Roads, the rows of the connected routes file will vary greatly in the number of file roots listed.

Concurrencies within which not all concurrent routes are signed

This section concerns typically well signed routes that whose numbers are signed with trailblazers or are not signed at all within a section of highway concurrent with other routes. For example, France's A4 and A26 merge and split, but along the merged section, A4 is signed and A26 is not, but both routes are signed beyond the concurrent section. Should the not-signed routes be chopped into its signed pieces or made continuous and concurrent with the signed route?

We have 4 cases that are treated differently. The descriptions refer to concurrencies of two routes, but the ideas generalize to concurrencies of more routes.

  1. Unsigned but implied multiplexes: Treat as continuous routes.

    This is the case where only one route is signed where another one route merges onto the same road. Usually the unsigned route splits off at another point, then it's signed beyond the concurrency. Continuity is still implied by the way the routes are numbered even if the signs are simplified to show only one route, so we treat each route as a continuous one.

    Examples:

    USA MD 23/MD 165: MD 23 was continuously signed before a relocation that created the duplex. In the current state, MD 23 is signed as "TO MD 23" at its approaches to the duplex, and MD 165 is signed continuously. MD 23 should continue to be treated as continuous.

    ENG A414: Follow the length of A414 and you'll see several concurrent routes, sometimes shown as A414 and sometimes as the other route, at least as Google Maps shows it. A system of surface highways with a bypass here and there is bound to be full of concurrencies, and so chopping half the routes into pieces around the concurrent parts would create a zillion "extra" files for short pieces of routes.

    FRA A4/A26: The two freeways merge and split. The pieces of A26 could have been given different numbers, but instead they were given the same number, as if it should be one long route rather than two.

  2. Bypassed, segmented routes: Discontinuous routes.

    Here some pieces of an old route were bypassed by a new route, but other pieces of the old route were upgraded into the new route. This makes a continuously signed new route with pieces of the old route beginning and ending at various places along the new route.

    Examples:

    Bannered highways, like Alternate and Business routes, in the US: Many US highways, for example, have many auxiliary routes with the same designation, like US 40 having many US 40 Business routes. The auxiliary routes are treated discontinuously, rather than having one long, continuous US 40 Business concurrent along sections of US 40.

    US 40/MD 144: There are several pieces of MD 144 along the old alignment of US 40. The pieces act like Business or Alternate routes and are never signed to suggest continuity.

  3. Alternating designation: Discontinuous routes.

    A road changes designations back and forth without either route splitting off on its own.

    Example:

    Ireland's M/N routes come to mind here. Part of N8 was upgraded to M8, but there is no alternative N8 along that section. However, N8 leads straight into M8 at each end of M8. So if the highway goes N8-M8-N8, we'll have three files for these three routes.
  4. Like designations that aren't concurrent: Discontinuous routes.

    By whatever reasoning, two unrelated, distant highways were given the same designation.

    Example:

    PA 97 (in NW Pennsylvania) and PA 97 (in southern PA), both part of the state highway system in Pennsylvania.


 

CHM Home | Track your cumulative travel | Highway Browser | Traveler Stats | Newly added highways | Concurrency List
Collaboration | Acknowledgments | Report a highway data error | Discussion Forum
Have you traveled on any highways in the Iowa State Highways system?