@@ -24,9 +24,11 @@ The Python Shapefile Library (pyshp) reads and writes ESRI Shapefiles in pure Py
- [Adding Records](#adding-records)
- [File Names](#file-names)
- [Saving to File-Like Objects](#saving-to-file-like-objects)
- - [Working with Large Shapefiles](#working-with-large-shapefiles)
- [Python Geo Interface](#python-geo-interface)
- - [Testing](#testing)
+ - [Working with Large Shapefiles](#working-with-large-shapefiles)
+ - [Unicode and Shapefile Encodings](#unicode-and-shapefile-encodings)
+
+ [Testing](#testing)

# Overview

@@ -714,6 +716,35 @@ write them.
>>> # Normally you would call the "StringIO.getvalue()" method on these objects.
>>> shp = shx = dbf = None

+ ## Python Geo Interface
+
+ The Python \_\_geo_interface\_\_ convention provides a data interchange interface
+ among geospatial Python libraries. The interface returns data as GeoJSON which gives you
+ nice compatibility with other libraries and tools including Shapely, Fiona, and PostGIS.
+ More information on the \_\_geo_interface\_\_ protocol can be found at:
+ [https://gist.github.com/sgillies/2217756](https://gist.github.com/sgillies/2217756).
+ More information on GeoJSON is available at [http://geojson.org](http://geojson.org).
+
+
+ >>> s = sf.shape(0)
+ >>> s.__geo_interface__["type"]
+ 'MultiPolygon'
+
+ Just as the library can expose its objects to other applications through the geo interface,
+ it also supports receiving objects with the geo interface from other applications.
+ To write shapes based on GeoJSON objects, simply send an object with the geo interface or a
+ GeoJSON dictionary to the shape() method instead of a Shape object. Alternatively, you can
+ construct a Shape object from GeoJSON using the "geojson_to_shape()" function.
+
+
+ >>> w = shapefile.Writer()
+ >>> w.field('name', 'C')
+
+ >>> w.shape( {"type":"Point", "coordinates":[1,1]} )
+ >>> w.record('two')
+
+ >>> w.save('shapefiles/test/geojson')
+
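The geo interface itself is only a convention: any object that exposes a `__geo_interface__` attribute returning a GeoJSON-like dictionary can take part. The following is a minimal, library-independent sketch of that idea. `PointFeature` and `as_geojson` are hypothetical names used for illustration and are not part of PyShp.

```python
import json


class PointFeature(object):
    """Hypothetical class (not part of PyShp) that exposes the geo interface."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    @property
    def __geo_interface__(self):
        # The convention only asks for a GeoJSON-like mapping.
        return {"type": "Point", "coordinates": (self.x, self.y)}


def as_geojson(obj):
    """Hypothetical helper: accept a geo-interface object or a GeoJSON dict."""
    if hasattr(obj, "__geo_interface__"):
        return obj.__geo_interface__
    if isinstance(obj, dict) and "type" in obj:
        return obj
    raise TypeError("expected a __geo_interface__ object or a GeoJSON dict")


geom = as_geojson(PointFeature(1, 1))
text = json.dumps(geom)  # serializes like any other GeoJSON geometry
```

A dictionary obtained this way is the kind of input the paragraph above describes passing to the shape() method in place of a Shape object.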
## Working with Large Shapefiles

Despite being a lightweight library, PyShp is designed to be able to read and write
@@ -756,43 +787,43 @@ process and write any number of items, and even merging many different source fi
large shapefile. If you need to edit or undo any of your writing you would have to read the
file back in, one record at a time, make your changes, and write it back out.

- ## Python Geo Interface
+ ## Unicode and Shapefile Encodings

- The Python \_\_geo_interface\_\_ convention provides a data interchange interface
- among geospatial Python libraries. The interface returns data as GeoJSON which gives you
- nice compatibility with other libraries and tools including Shapely, Fiona, and PostGIS.
- More information on the \_\_geo_interface\_\_ protocol can be found at:
- [https://gist.github.com/sgillies/2217756](https://gist.github.com/sgillies/2217756).
- More information on GeoJSON is available at [http://geojson.org](http://geojson.org).
+ PyShp supports unicode and shapefile encodings, so text fields in a shapefile are always
+ returned as unicode strings.
+ Most shapefiles are written in UTF-8 encoding, PyShp's default encoding, so in most cases you don't
+ have to specify the encoding. To read a shapefile in any other encoding, such as Latin-1,
+ supply the encoding option when creating the Reader.


- >>> s = sf.shape(0)
- >>> s.__geo_interface__["type"]
- 'MultiPolygon'
+ >>> r = shapefile.Reader("shapefiles/test/latin1.shp", encoding="latin1")
+ >>> r.record(0) == [2, u'Ñandú']
+ True

- Just as the library can expose its objects to other applications through the geo interface,
- it also supports receiving objects with the geo interface from other applications.
- To write shapes based on GeoJSON objects, simply send an object with the geo interface or a
- GeoJSON dictionary to the shape() method instead of a Shape object. Alternatively, you can
- construct a Shape object from GeoJSON using the "geojson_as_shape()" function.
+ Once you have loaded the shapefile, you may choose to save it using another encoding that supports
+ more characters, such as UTF-8. Provided the new encoding supports the characters you are trying
+ to write, reading it back in should give you the same unicode string you started with.


- >>> w = shapefile.Writer()
- >>> w.field('name', 'C')
-
- >>> w.shape( {"type":"Point", "coordinates":[1,1]} )
- >>> w.record('one')
+ >>> w = shapefile.Writer(encoding="utf8")
+ >>> w.fields = r.fields[1:]
+ >>> w.record(*r.record(0))
+ >>> w.null()
+ >>> w.save("shapefiles/test/latin_as_utf8.shp")

- >>> shape = shapefile.geojson_to_shape( {"type":"Point", "coordinates":[2,2]} )
- >>> shape.shapeType
- 1
- >>> shape.points
- [[2, 2]]
+ >>> r = shapefile.Reader("shapefiles/test/latin_as_utf8.shp", encoding="utf8")
+ >>> r.record(0) == [2, u'Ñandú']
+ True

- >>> w.shape(shape)
- >>> w.record('two')
+ If you supply the wrong encoding and the string cannot be decoded, PyShp will by default raise an
+ exception. If, on rare occasions, you are unable to find the correct encoding and want to ignore
+ or replace encoding errors, you can specify the "encodingErrors" option to be used by the decode
+ method. This applies to both reading and writing.

- >>> w.save('shapefiles/test/geojson')
+
+ >>> r = shapefile.Reader("shapefiles/test/latin1.shp", encoding="ascii", encodingErrors="replace")
+ >>> r.record(0) == [2, u'�and�']
+ True
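The effect of the error-handling modes can be seen with Python's own codecs, without any shapefile involved. This sketch assumes that the "encodingErrors" value is handed to the standard `decode` error handler, as the paragraph above suggests. The variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
text = u"Ñandú"

# Bytes as they would be stored for this string in a Latin-1 encoded file
# (illustrative stand-in for a text field, not PyShp's actual file layout).
latin1_bytes = text.encode("latin1")

# Decoding with the correct encoding restores the original string.
roundtrip = latin1_bytes.decode("latin1")

# Decoding with the wrong encoding fails under the default "strict" handler.
try:
    latin1_bytes.decode("ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "replace" substitutes U+FFFD for each undecodable byte: u'\ufffdand\ufffd'
replaced = latin1_bytes.decode("ascii", "replace")

# "ignore" silently drops the undecodable bytes: u'and'
ignored = latin1_bytes.decode("ascii", "ignore")
```

Both lenient modes lose information, so they are best kept for cases where the correct encoding really cannot be determined.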

# Testing
