Each graphical representation is much more effective if spatial data is used. Tableau supports Spatial data, and .SHP files are the most commonly used source of data, but what are these?.
What is Shapefile?
The shapefile format is a geospatial vector data format for geographic information system (GIS) software. It is developed and regulated by Esri as a mostly open specification for data interoperability among Esri and other GIS software products. The shapefile format can spatially describe vector features: points, lines and polygons.
The term “shapefile” is quite common, but the format consists of a collection of files with a common filename prefix, stored in the same directory. The three mandatory files have filename extensions .shp, .shx, and .dbf.
A Spatial Reference System Identifier (SRID) is a unique value used to unambiguously identify projected, unprojected, and local spatial coordinate system definitions. These coordinate systems form the heart of all GIS applications. Most often there is an .xml or .prj file to help us see which SRID we are dealing with.
If it is a large amount of data, it should be transferred to the database, and Tableau supports direct connections to spatial data using these connectors:
- Microsoft SQL Server
- PostgreSQL + PostGIS
- Pivotal Greenplum + PostGIS
When you need to accurately enter coordinates in a GIS, the first step is to uniquely define all coordinates on Earth.
This means you need a reference frame for your latitude and longitude coordinates because where would you be on Earth without having reference to it?
Because the Earth is curved – and in GIS we deal with flat map projections – we need to accommodate both the curved and flat views of the world. Surveyors and geodesists have accurately defined locations on Earth. More about map projections at https://en.wikipedia.org/wiki/List_of_map_projections and https://en.wikipedia.org/wiki/Map_projection
We begin modelling the Earth with an ellipsoid. To be clear, this is different from a geoid which is a measure of mean sea level. Over time, the ellipsoid has been estimated to the best of our ability through a massive collection of surface measurements.
When you combine these measurements, we arrive at a geodetic datum. Horizontal datums precisely specify each location on Earth’s surface in latitude and longitude or other coordinate systems. For instance, NAD83, WGS84 and ETRS89 are examples of geodetic datums.
Using geographic coordinate systems, we can define positions on Earth. For example, we use degrees/minutes/seconds (DMS) and decimal degrees (DD) for spherical coordinates. After we flatten the Earth with map projections, we use projected coordinate systems like the State Plane Coordinate System (SPCS) or UTM grid system.
In Tableau, the following geodetic datums (SRIDs) are supported:
- NAD83 (EPSG:4269)
- ETRS89 (EPSG:4258)
- WGS84 (EPSG:4326)
The North American Datum of 1983 (NAD 83) is the most current datum being used in North America. It provides latitude and longitude and some height information using the reference ellipsoid GRS80. Geodetic datums like the North American Datum 1983 (NAD83) form the basis of coordinates of all horizontal positions for Canada and the United States.
In 1990 at it symposium in Firenze, Italy, EUREF designed the European Terrestrial Reference System 89 (ETRS89) in such a way that it would be based on the ITRS except that it is tied to the stable part of Europe, so that the relations between European stations are kept fixed. Coordinates in ETRS89 are expressed as either three dimensional (X, Y, Z) Cartesian coordinates or as three dimensional ellipsoidal coordinates (Φ, λ and H, Ellipsoidal height), based on the GRS80 ellipsoid.
The acceptance of ETRS89 by several communities (civil aviation, industry, national and regional agencies) as the backbone for geo-referencing in Europe is a continuous process. EuroControl (European Agency for Safely of Air navigation) uses ETRS89 since a long time in its technical specifications and there is a recommendation by the European Commission to adopt ETRS89 as the geodetic datum for geo-referenced information and to promote the use of ETRS89 within its member states.
The Global Positioning System uses the World Geodetic System (WGS84) as its reference coordinate system.
In spatially enabled databases (such as IBM DB2, IBM Informix, Ingres, Microsoft SQL Server, MySQL, Oracle RDBMS, Teradata, PostGIS, SQL Anywhere and Vertica), SRIDs are used to uniquely identify the coordinate systems used to define columns of spatial data or individual spatial objects in a spatial column.
Remember: If you must show continent-wide or worldwide data, then use WGS84, if you are lowering the level of information to a region or city, use a national projection.
To move data from Shapefiles to a database, we can use Open Source tools to do all that quickly!I would recommend to use ogr2ogr (https://gdal.org/programs/ogr2ogr.html) because the data may be in a projection used in your country and should be transferred to a projection that Tableau supports.
Using ogr2ogr is really easy:
ogr2ogr -f "MSSQLSpatial" "MSSQL:server=<YOUR DB SERVER>;Database=<YOUR DATABASE NAME>;Trusted_Connection=yes" "<YOUR SHAPEFILE NAME>" -a_srs "EPSG:<YOUR SHAPEFILE SRID>"
How to use ogr2ogr for changing the coordinate reference system from EPSG:4326 to EPSG:3857:
ogr2ogr -s_srs EPSG:4326 -t_srs EPSG:3857 -f GPKG output.gpkg input.gpkg
If the data is stored in Database in WGS84 or ETRS89 but you have Geometry and you need Geography, we can use the following trick:
We must add 2 fields to the tables we want to convert:
We need Geometry and Geography fields:
FGeometry as Geometry and Geom2 as Geography;
1) Step one for transformation Geometry
SELECT d.OID, d.Geometry,
CONVERT(varbinary(max), d.Geometry, 1) as ConvertedGeometryBin,
(cast(CONVERT(varbinary(max), d.Geometry, 1) as geometry)) as FGeometry
FROM MyTable d;
2) Update our table with new values for Geometry
UPDATE MyTable
SET FGeometry = (cast(CONVERT(varbinary(max), Geometry, 1) as geometry));
3) Update our table with Geography data
UPDATE MyTable
SET Geom2 = FGeometry.STAsText();
4) Select data to see difference between Geography and Geometry data
SELECT d.OID, d.FGeometry, d.Geom2.STAsText(), d.FGeometry.STSrid
FROM MyTable d;
5) Copy desired columns into new table with Geography as Geometry
INSERT INTO table_new (geog) -- add fields to copy
SELECT GEOGRAPHY::STGeomFromWKB(table_old.geom.STAsBinary(), 4326)
FROM table_old;
Once you transfer Shapefile to DB, you can really do quality analysis over the data.
Although databases allow points, lines and areas to be in the same table, it is preferable that each element goes to its own table in the database.