Golang has a lot built in, compared to C; for example, it has garbage collection, an interesting new way to do ad-hoc polymorphism, parametrically polymorphic finite maps, a range construct, lightweight and relatively safe concurrency constructs, complex numbers, type-safe variadic parameters, run-time type identification, exception handling with LIFO automatic cleanup (defer), a string type with equality and concatenation, and a print statement. However, without importing at least some packages, there’s no way for a Go program to obtain input, and there are a lot of facilities easily accessible in the standard library that can save you a lot of time.
I’m just starting to learn Golang, so I’m taking some notes on the standard library from http://localhost:6060/pkg/ (a local godoc server). Unfortunately it’s organized alphabetically, so really recondite stuff is mixed in with really basic stuff. This is my attempt to summarize what I think is most important.
The standard Golang library seems to be entirely lacking facilities for interactive terminal I/O and GUIs.
Really basic packages
These are all you need for a pretty wide range of stuff: testing, io, fmt, os, strconv, bytes, strings, math, and sort. Without any one of these, you’d be pretty handicapped.
testing
go test runs automated test suites written using functions named TestFoo, which take pointers to testing.T objects, in foo_test.go files. It also has benchmarking and doctest-like functionality. I have to admit I haven’t tried this yet, because so far I haven’t written any Golang packages, just standalone programs, and I’m not sure how to use it with those.
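Here’s a minimal sketch of what I understand a test file to look like, assuming a hypothetical package euclid that defines a GCD function:
package euclid

import "testing"

// This lives in euclid_test.go, next to the euclid.go it tests.
func TestGCD(t *testing.T) {
	if got := GCD(12, 18); got != 6 {
		t.Errorf("GCD(12, 18) = %d; want 6", got)
	}
}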
The testing/quick subpackage does generative property-based testing, like Hypothesis.
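A sketch of what that looks like, checking a trivially true property against randomly generated arguments:
package euclid

import (
	"testing"
	"testing/quick"
)

func TestAddCommutes(t *testing.T) {
	// quick.Check calls the property function with random inputs
	// and reports the first input for which it returns false.
	commutes := func(a, b int) bool { return a+b == b+a }
	if err := quick.Check(commutes, nil); err != nil {
		t.Error(err)
	}
}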
fmt
fmt does formatted I/O, including Printf. It can use reflection to dump out structs (%v, applicable to any type, or %+v to see field names) and even Golang-syntax representations (%#v). Naturally there’s a fmt.Formatter interface you can implement to override the default formatting.
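For example, the three verbs applied to the same struct:
package main

import "fmt"

type point struct{ X, Y int }

func main() {
	p := point{1, 2}
	fmt.Printf("%v\n", p)  // {1 2}
	fmt.Printf("%+v\n", p) // {X:1 Y:2}
	fmt.Printf("%#v\n", p) // main.point{X:1, Y:2}
}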
Because of reflection, fmt.Fscan, fmt.Scan, fmt.Sscan, etc., don’t need a format string at all — you just give them some interface values pointing at where you want to store the results — and the same is true of fmt.Print. Scanning can be overridden with a custom Scan method.
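A small sketch of format-string-free scanning; the types of the pointed-to variables drive the parse:
package main

import "fmt"

func main() {
	var name string
	var age int
	n, err := fmt.Sscan("ada 36", &name, &age)
	fmt.Println(n, err, name, age) // 2 <nil> ada 36
}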
There’s also a print built-in you can use without importing fmt.
io
io is where you find the Reader and Writer protocols, among others; fmt.Fprintf takes an io.Writer rather than a file. It also contains things like io.Copy, which normally copies a Reader to a Writer, but when possible uses more efficient methods, which I assume means sendfile(2); plumbing utilities like io.LimitReader, io.Pipe, io.MultiReader (cat), io.MultiWriter (tee), and io.TeeReader (tee); io.ReaderAt, suitable for concurrent random-access record I/O from multiple goroutines; etc.
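A sketch of the tee-style plumbing, copying one Reader to two Writers at once (the /tmp path is just for illustration):
package main

import (
	"io"
	"log"
	"os"
	"strings"
)

func main() {
	f, err := os.Create("/tmp/tee-demo.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	// everything read from the Reader goes to both os.Stdout and f
	r := strings.NewReader("hello, io\n")
	if _, err := io.Copy(io.MultiWriter(os.Stdout, f), r); err != nil {
		log.Fatal(err)
	}
}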
An interesting difference from Unix (or Python) is that Read() is supposed to return the error io.EOF at EOF rather than just an empty byte count. This has the annoying result that if you import "os" and do file I/O with it, you probably also need to import "io".
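So the idiomatic read loop looks something like this sketch:
package main

import (
	"io"
	"log"
	"os"
)

func main() {
	buf := make([]byte, 4096)
	for {
		n, err := os.Stdin.Read(buf)
		if n > 0 {
			os.Stdout.Write(buf[:n]) // process the n bytes we got
		}
		if err == io.EOF {
			break // end of input, not a real error
		}
		if err != nil {
			log.Fatal(err)
		}
	}
}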
The subpackage io/ioutil contains ReadFile, WriteFile, and tempfile helpers.
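These save you the whole Open/Read/Close dance; a quick sketch (the paths are just for illustration):
package main

import (
	"io/ioutil"
	"log"
)

func main() {
	data, err := ioutil.ReadFile("/etc/hostname") // the whole file as one []byte
	if err != nil {
		log.Fatal(err)
	}
	if err := ioutil.WriteFile("/tmp/hostname-copy", data, 0644); err != nil {
		log.Fatal(err)
	}
}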
os
os is where you find Args and most of the Unix API, including things like Open(filename), Getpid(), and os.Stdout, an unbuffered io.Writer that you can wrap with bufio. Opening a file for writing requires calling either os.Create or os.OpenFile.
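A sketch of both, writing to a file and buffering os.Stdout (the /tmp path is illustrative):
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
)

func main() {
	f, err := os.Create("/tmp/demo.txt") // create, or truncate if it exists
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	f.WriteString("hello, file\n") // no []byte conversion needed

	w := bufio.NewWriter(os.Stdout) // wrap the unbuffered os.Stdout
	defer w.Flush()                 // nothing appears until the buffer is flushed
	fmt.Fprintln(w, "buffered hello")
}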
There’s a separate File.WriteString method for when you have a string rather than a []byte to write to a file. This is sort of strange, because you can convert from string to []byte with []byte(s), which makes me think this method may be left over from an earlier version of Golang where you couldn’t do that.
The design is a mix of very Unixy and somewhat portable. File permissions are a 32-bit int with no place to put, for example, a separate “delete” permission bit, as on VMS. But the inode returned by os.Stat (FileInfo) isn’t even a concrete type at all, but an interface. (And it includes the file’s basename, but not e.g. ctime, so it’s only sort of an inode.)
The os.Process API is an interesting design, clearly made with non-Unix systems in mind. (There’s no os.Fork!) It’s somewhat weak; there’s no nonblocking waitpid, for example, and thus no way to get an os.ProcessState for a running process! os/exec has a more convenient interface, but it’s apparently built on os.Process and therefore can’t do anything os.Process can’t.
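The convenient interface looks like this sketch:
package main

import (
	"fmt"
	"log"
	"os/exec"
)

func main() {
	// run a subprocess and capture its stdout
	out, err := exec.Command("echo", "hello from a subprocess").Output()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s", out)
}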
There also doesn’t seem to be a way to call select(2), although of course you can spawn off a goroutine or two per file descriptor.
strconv
strconv is where you look for conversion to and from strings, although fmt.Sprintf gives you a more capable way to build strings.
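The usual suspects, as a sketch:
package main

import (
	"fmt"
	"strconv"
)

func main() {
	n, err := strconv.Atoi("42") // string to int
	fmt.Println(n, err)          // 42 <nil>
	fmt.Println(strconv.Itoa(42))              // int to string
	fmt.Println(strconv.ParseFloat("2.5", 64)) // 2.5 <nil>
	fmt.Println(strconv.Quote("a\tb"))         // "a\tb", quoted and escaped
}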
bytes
Things you would expect to be utility methods on strings (ToUpper, Join, Split, TrimRight, Compare, HasPrefix) are here, as are io.Reader and io.Writer interfaces for in-memory “files”. There’s a corresponding strings package for strings.
There are actually two different io.Reader implementations: the read-write bytes.Buffer and the seekable bytes.Reader.
bytes.Buffer, in addition to converting output-generation functions into string-building functions, also allows you to buffer your output in a more controllable way than the bufio package mentioned below — it guarantees to return no errors when you write to it (it will panic instead if it runs out of memory) and allows you to accumulate the bytes written until you're ready to do something more interesting with them, like send them over a socket.
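A sketch of that accumulate-then-use pattern:
package main

import (
	"bytes"
	"fmt"
	"os"
)

func main() {
	var buf bytes.Buffer // the zero value is ready to use
	for i := 3; i > 0; i-- {
		fmt.Fprintf(&buf, "%d...\n", i) // writes to a Buffer never fail
	}
	os.Stdout.Write(buf.Bytes()) // now do something with the accumulated bytes
}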
strings
This is almost the same as the bytes package, but for strings, even to the point of including the unnecessary Compare function. Naturally, though, it omits the read-write Buffer type.
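A few of the utility functions, as a sketch:
package main

import (
	"fmt"
	"strings"
)

func main() {
	parts := strings.Split("a,b,c", ",")
	fmt.Println(parts)                             // [a b c]
	fmt.Println(strings.Join(parts, "-"))          // a-b-c
	fmt.Println(strings.ToUpper("shout"))          // SHOUT
	fmt.Println(strings.HasPrefix("golang", "go")) // true
}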
math
This has the usual set of floating-point functions; you know, Acosh, Cos, Ceil, IsNaN, Max, Bessel functions, Erfc, and so on. Angles are in radians, and logarithms are natural where not otherwise specified. There are functions to convert floats to and from raw bits.
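A sketch, including the raw-bits conversions:
package main

import (
	"fmt"
	"math"
)

func main() {
	fmt.Println(math.Sqrt(2))     // 1.4142135623730951
	fmt.Println(math.Log(math.E)) // natural log, so 1
	bits := math.Float64bits(1.0) // raw IEEE 754 representation
	fmt.Printf("%#x\n", bits)     // 0x3ff0000000000000
	fmt.Println(math.Float64frombits(bits)) // back to 1
}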
sort
sort provides comparison-based sorting, with stable and unstable variants, and it’s somewhat less comfortable than its Python equivalent, not only because you have to import it explicitly, but because it requires that the sequence you’re sorting implement sort.Interface, which is impossible for raw slices. (The sequence implements the interface, not the data items being sorted.)
This package contains heapsort, insertion sort, binary quicksort with some optimized pivot selection, and mergesort; the quicksort fails over to heapsort if it goes badly, an approach sometimes known as “introsort”.
The mergesort is apparently a 21st-century optimization of binary mergesort which uses only logarithmic space rather than the usual linear space, at the cost of some extra swaps.
The package also implements binary search as sort.Search, generalized to a monotonic predicate over the integers 0..n (i.e. not just for searching sorted sequences in memory).
For convenience, there are a number of wrappers for sorting and searching built-in data types.
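A sketch of both styles: implementing sort.Interface on a slice type, plus the convenience wrappers for built-in element types (byLen is my own hypothetical example):
package main

import (
	"fmt"
	"sort"
)

// byLen adapts a []string so sort can order it by string length.
type byLen []string

func (s byLen) Len() int           { return len(s) }
func (s byLen) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }
func (s byLen) Less(i, j int) bool { return len(s[i]) < len(s[j]) }

func main() {
	words := []string{"banana", "fig", "apple"}
	sort.Sort(byLen(words))
	fmt.Println(words) // [fig apple banana]

	ns := []int{3, 1, 2}
	sort.Ints(ns)                       // convenience wrapper
	fmt.Println(sort.SearchInts(ns, 2)) // binary search: 1
}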
These are very common things but not quite as absolutely basic as the ones listed above. These are flag, log, bufio, math/rand, time, net/http, net/url, net/mail, mime, regexp, encoding/gob, encoding/json, encoding/binary, image, C, and yacc.
flag
This is the standard way to implement command-line parsing. For better or worse, it doesn’t support combinable Unix-style single-letter flags, just long-only options, and it doesn’t support GNU-style options following non-option arguments.
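A sketch of typical usage (the flag names here are just examples):
package main

import (
	"flag"
	"fmt"
)

var verbose = flag.Bool("verbose", false, "print more detail")
var name = flag.String("name", "world", "who to greet")

func main() {
	flag.Parse() // parses os.Args[1:] into the registered flags
	if *verbose {
		fmt.Println("about to greet")
	}
	fmt.Println("hello,", *name)
}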
log
log is the standard way to log errors and other messages. This is how you debug your programs, and log.Fatal or log.Fatalf is how you report fatal errors and exit. There’s no tagging, logging levels, complex object serialization, or filtering, but you can set a prefix and twiddle a couple of formatting flags. You can create different log.Logger objects with log.New, sending information to the same or different files, and pass them around as you like.
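A sketch of a custom logger plus log.Fatal (the "myapp: " prefix is just an example):
package main

import (
	"log"
	"os"
)

func main() {
	logger := log.New(os.Stderr, "myapp: ", log.LstdFlags|log.Lshortfile)
	logger.Println("starting up")
	if _, err := os.Open("no-such-file"); err != nil {
		log.Fatal(err) // logs the message, then calls os.Exit(1)
	}
}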
I wanted to put this in the “Really basic packages” category, since I import it in every Golang program I write, but the truth is that you can get by without it a lot more easily than you can get by without math, sort, or strconv.
bufio
This adds I/O buffering to an io.Writer (for performance) or an io.Reader (so you can put back already-read bytes or runes, for parsing reasons, as fmt.Fscan does). It also contains bufio.Scanner, which parses things like (limited) CSV. (There’s also a CSV parser in encoding/csv.)
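A line-counting sketch with bufio.Scanner, which splits on lines by default:
package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	scanner := bufio.NewScanner(os.Stdin)
	n := 0
	for scanner.Scan() {
		n++
		_ = scanner.Text() // the current line, minus the newline
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
	fmt.Println(n, "lines")
}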
math/rand
This is a random number generator, which includes a shuffler in the form of rand.Perm.
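A sketch; note that without seeding you get the same sequence every run:
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	rand.Seed(time.Now().UnixNano()) // the default seed is deterministic
	fmt.Println(rand.Intn(6) + 1)    // one die roll
	fmt.Println(rand.Perm(5))        // a random permutation of 0..4
}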
time
This package mixes together access to the real-time clock (time.Now()), calendrical calculations, formatting, time zones, and scheduling. The time format used internally is nanosecond-precision and has its zero value at the beginning of year 1 CE in UTC.
This package includes a syntax for durations, which are int64 nanosecond counts; the syntax accepts “300μs”. ♥
Because Go has no operator overloading, the Time type has .Add and .Sub methods to interact with Duration.
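A sketch of durations and the Add/Sub dance:
package main

import (
	"fmt"
	"time"
)

func main() {
	d, err := time.ParseDuration("300us") // "300µs" parses too
	if err != nil {
		panic(err)
	}
	start := time.Now()
	later := start.Add(2*time.Hour + d) // Time + Duration via a method
	fmt.Println(later.Sub(start))       // 2h0m0.0003s
}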
net/http
This includes a fairly easy HTTP/1.1 and HTTP/2 client and server with TLS support. An HTTP server fits in a Tweet:
package main

import . "net/http"

func main() {
	HandleFunc("/", func(w ResponseWriter, r *Request) { w.Write([]byte("hello")) })
	ListenAndServe(":8080", nil)
}
This also provides a convenient interface to the query-string parsing in net/url; the http.Request object above has a FormValue method.
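The client side is about as small; a sketch (example.com is just a placeholder):
package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
)

func main() {
	resp, err := http.Get("http://example.com/")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Status, len(body), "bytes")
}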
net/url
net/url does URL parsing, construction, escaping, unescaping, relative URL resolution, and query-string encoding and decoding.
u, err := url.Parse("http://bing.com/search?q=dotnet")
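Expanding that snippet into a complete sketch that picks the URL apart:
package main

import (
	"fmt"
	"log"
	"net/url"
)

func main() {
	u, err := url.Parse("http://bing.com/search?q=dotnet")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(u.Host)             // bing.com
	fmt.Println(u.Query().Get("q")) // dotnet
}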
net/mail
net/mail parses RFC-822 (well, RFC-5322) mail messages and addresses, but not MIME, and it doesn’t format mail messages, just parses them.
mime
mime implements “parts of the MIME spec”, including reading /etc/mime.types and the b-encoding and q-encoding used in mail headers. Subpackages implement quoted-printable and multipart encoding, but I don’t think there’s anything here that will give you a decoded email message body directly; this package is more aimed at handling mail headers and HTTP messages.
regexp
This implements a fairly Perl-compatible regexp engine, including non-greediness and the Python extension of named (?P<foo>bar) capture groups, but with no backreferences. Given the history of Russ Cox and Ken Thompson in writing high-performance regular expression libraries, using DFAs that can’t handle backreferences, you would think that this library would be super fast, but usually it seems to be noticeably slower than Python’s and Perl’s. It does, however, avoid the exponential behavior characteristic of backtracking engines in cases like this:
// Pathological regexp. The Python equivalent takes exponentially long:
// __import__('re').compile('(x|xx)*y').match('xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')
package main

import "regexp"

func main() {
	regexp.MustCompile("(x|xx)*y").FindString("xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")
}
encoding/gob
This is a generic serialization/deserialization system, using a Go-specific serialization, and it seems to be the closest equivalent to Python’s pickle, though it doesn’t support circular references. It supports maps, structs, arrays, slices, bools, signed and unsigned integers, floating-point numbers, complex numbers, and strings; structs include only exported fields, and functions and channels are omitted. Interface types are supported with a little more hassle.
It is not necessary to register user-defined types before sending them or for them to implement any particular interface, although there is an interface that they can define to override the default marshaling. There is some limited support for schema evolution even in the default marshaling.
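A round-trip sketch:
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

type Point struct{ X, Y int } // only exported fields get encoded

func main() {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(Point{3, 4}); err != nil {
		log.Fatal(err)
	}
	var p Point
	if err := gob.NewDecoder(&buf).Decode(&p); err != nil {
		log.Fatal(err)
	}
	fmt.Println(p) // {3 4}
}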
encoding/json
json.Marshal converts a Golang value into JSON; it supports bools, numbers, strings, arrays, slices, maps with string keys, and structs (except for unexported fields or fields tagged with “-”). This is actually the only use I’ve seen so far of struct field tags. Naturally, you can override the JSON serialization by implementing an interface.
json.Unmarshal takes a pointer destination argument, which I suppose allows it to determine what types to deserialize things as.
This package does support reading a sequence of JSON values from the same input stream; perhaps more interestingly, it has some minimal support for sequentially reading JSON items inside an outer JSON wrapper.
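A round-trip sketch, including a field tag:
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

type Person struct {
	Name string `json:"name"` // the tag renames the JSON key
	Age  int    `json:"age"`
}

func main() {
	out, err := json.Marshal(Person{"Ada", 36})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out)) // {"name":"Ada","age":36}

	var p Person
	if err := json.Unmarshal([]byte(`{"name":"Grace","age":45}`), &p); err != nil {
		log.Fatal(err)
	}
	fmt.Println(p.Name, p.Age) // Grace 45
}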
encoding/binary
encoding/binary contains functions Read and Write which can be used to parse and generate binary data in an externally-imposed format. They’re slow because they use reflection (“Python-class slow,” as Tommi Virtanen explained to me). This provides an easy way to do what the Perl pack and unpack functions do, or struct.pack and struct.unpack from Python.
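A pack/unpack-style sketch; the header layout here is made up for illustration:
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"log"
)

type header struct { // a hypothetical externally-imposed record format
	Magic uint32
	Count uint16
}

func main() {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.BigEndian, header{0xCAFEBABE, 3}); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("% x\n", buf.Bytes()) // ca fe ba be 00 03

	var h header
	if err := binary.Read(&buf, binary.BigEndian, &h); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%#x %d\n", h.Magic, h.Count) // 0xcafebabe 3
}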
image
This package and its subpackages provide support for encoding and decoding PNG, GIF (including animation), and JPEG files. You generally load images by calling image.Decode without explicitly mentioning the specific image format package, except to import it for its side effects; but to save e.g. a PNG you need to call png.Encode.
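A decoding sketch (picture.png is a placeholder filename):
package main

import (
	"fmt"
	"image"
	_ "image/png" // imported only to register the PNG decoder
	"log"
	"os"
)

func main() {
	f, err := os.Open("picture.png")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	img, format, err := image.Decode(f)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(format, img.Bounds())
}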
The image.Image
interface looks impossibly inefficient, requiring an
indirect function call and some coordinate arithmetic for every pixel.
It looks like the PNG loading is eager and ahead-of-time, storing the
image in RAM, although image.Image
doesn’t strictly require that.
The image
module also contains some basic 2-D graphics stuff, like
image.Rectangle
(bounding boxes) and image.Point
.
C
cgo isn’t actually a Golang package; it’s a way to invoke statically-linked C libraries. But you import it as a package named "C", and you add some special comments before that import in order to link in the stuff you want.
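For example, this sketch calls libm’s sqrt; the #cgo line supplies the linker flag:
package main

/*
#cgo LDFLAGS: -lm
#include <math.h>
*/
import "C"

import "fmt"

func main() {
	fmt.Println(C.sqrt(2)) // the C library's sqrt, via cgo
}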
...and then the Go compiler searches your directory for files named *.c, *.s, *.S, *.cc, *.cpp, or *.cxx (but not *.C) to compile with the C, C++, or assembly compiler. Except that I’m not clear which compiler it uses for this: GCC or the Plan9 C compiler?
Presumably if you want to load and invoke shared libraries written in C, this is the way to do it.
yacc
If you want to parse something beyond regular languages, go tool yacc is probably the thing to use. It’s mostly undocumented, but it’s similar enough to Bison, ocamlyacc, etc., that it should be reasonably approachable.
I’ve also taken some notes on some other packages that seem less central to me.
html/template
html/template gives you convenient XSS-safe HTML templating. Its capabilities are fairly elaborate, and there's also a text/template package for doing the same kind of thing without the HTML escaping.
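A sketch showing the escaping in action:
package main

import (
	"html/template"
	"os"
)

func main() {
	t := template.Must(template.New("demo").Parse("<p>Hello, {{.}}!</p>\n"))
	// the hostile payload comes out entity-escaped, not executable:
	// <p>Hello, &lt;script&gt;alert(1)&lt;/script&gt;!</p>
	t.Execute(os.Stdout, "<script>alert(1)</script>")
}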
syscall
This deprecated package provides access to basically the entire Linux (or other) native system call interface, including weird things like epoll(7), ETH_P_IRDA, inotify(7), SCM_RIGHTS, and openat(2). The unsafe package has some details on directly invoking Linux system calls.
This is also the package where you get errno values, if you want to test those against the Err field of e.g. an os.PathError.
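A sketch of that errno test:
package main

import (
	"fmt"
	"os"
	"syscall"
)

func main() {
	_, err := os.Open("no-such-file")
	if pe, ok := err.(*os.PathError); ok && pe.Err == syscall.ENOENT {
		fmt.Println("errno says ENOENT:", pe)
	}
}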
encoding/xml
encoding/xml contains an XML parser and emitter, some HTML parsing, and something that looks suspiciously like another generic serialization system like Gob and the one in encoding/json, but really isn’t.
The serialization system handles arrays, slices, structs, and interfaces, but not maps, and like the JSON serializer, it optionally uses struct field tags. It seems to be designed to allow you to produce arbitrary XML by marshalling structs, and parse almost arbitrary XML by unmarshalling.
This package doesn’t provide DOM or SAX interfaces, but it has interfaces that are sort of similar.
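A marshalling sketch, with tags choosing between elements and attributes:
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

type Person struct {
	XMLName xml.Name `xml:"person"`
	Name    string   `xml:"name"`
	Age     int      `xml:"age,attr"` // an attribute rather than a child element
}

func main() {
	out, err := xml.MarshalIndent(Person{Name: "Ada", Age: 36}, "", "  ")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out)) // <person age="36"><name>Ada</name></person>, indented

	var p Person
	if err := xml.Unmarshal([]byte(`<person age="45"><name>Grace</name></person>`), &p); err != nil {
		log.Fatal(err)
	}
	fmt.Println(p.Name, p.Age) // Grace 45
}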
net
This library provides sockets, basically, but with a somewhat nicer interface. Like, actually a dramatically nicer interface. TCP sockets have SetNoDelay and SetLinger methods, for example, and you can invoke net.Dial("tcp", "google.com:http") to open a connection.
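A sketch of a bare-sockets HTTP request using that service-name form:
package main

import (
	"bufio"
	"fmt"
	"log"
	"net"
)

func main() {
	conn, err := net.Dial("tcp", "google.com:http") // "http" resolves to port 80
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	fmt.Fprintf(conn, "HEAD / HTTP/1.0\r\n\r\n")
	status, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(status) // e.g. HTTP/1.0 200 OK
}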
index/suffixarray
suffixarray provides an in-RAM search index that supports regular expression searches, though using a log-linear-time construction algorithm rather than a linear-time one. On my laptop, it takes about 1.8 seconds to index a megabyte, about 26 seconds to index 10 megabytes, and 540 seconds to index 100 megabytes; then, each match for a simple regexp in the 10MB index takes 1.3 ms, and each match in the 100MB index takes 2.5 ms. It uses 1.6 GB of RAM to index 100 megabytes, which is maybe a bit excessive.
You can write the index to a file, which is about 4.8 bytes for each byte that you originally indexed. My laptop can then read the index in at about 100 megabytes a second, which is a necessary prerequisite to doing searches with it.
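The API itself is small; a sketch:
package main

import (
	"fmt"
	"index/suffixarray"
	"regexp"
)

func main() {
	idx := suffixarray.New([]byte("banana"))  // builds the index up front
	fmt.Println(idx.Lookup([]byte("an"), -1)) // offsets of "an", in no particular order
	fmt.Println(idx.FindAllIndex(regexp.MustCompile("a."), -1)) // [[1 3] [3 5]]
}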
As a point of comparison, I tried index/suffixarray, the raw regexp engine, and the Python regexp engine to find the '^From ' lines delimiting messages in a 100MB mbox. The indexed search found the 3593 From_ lines in 17.5 seconds. Just reading the file in and running the regexp engine over it found them in 12.6 seconds. The Python version took 2.0 seconds.
This seemed like unreasonably poor performance, so I tried searching for a spammer’s email address. The Python version found the three matches in 430 ms; the Golang brute-force version found them in 210 ms; and the indexed version found them in 4.3 seconds, of which 4.0 seconds were spent reading the index.
Just for kicks, I tried a Perl version. It produced the 3593 From_ lines in 220 ms and the three hits for the spammer’s address in 195 ms, but upon trying to find all the lines containing the address (regexp ^.*VanceE.McCray.*), I killed it after 21 minutes. Python managed to finish that job in 2.7 seconds, the Golang brute-force version in 36 seconds, and the Golang indexed version in 45 seconds.
So I haven’t been able to find any cases where index/suffixarray is faster than doing a brute-force regexp search on the file data in RAM, even though it uses several times as much RAM.
runtime
runtime provides control over the garbage collector (including setting finalizers), a Caller function to walk the stack, and a bunch of profiling stuff I don’t know how to use yet. (I guess I could call runtime/debug’s debug.PrintStack function periodically and save the results, and that would be a crude profiler.)
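A sketch of Caller, reporting where it was called from (the where helper is my own):
package main

import (
	"fmt"
	"runtime"
)

func where() string {
	pc, file, line, ok := runtime.Caller(1) // 1 = our caller's stack frame
	if !ok {
		return "unknown"
	}
	return fmt.Sprintf("%s:%d in %s", file, line, runtime.FuncForPC(pc).Name())
}

func main() {
	fmt.Println(where()) // e.g. /tmp/demo.go:17 in main.main
}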
net/rpc
net/rpc is an RPC system using the encoding/gob serialization; a JSON-serialized variant is in net/rpc/jsonrpc. It doesn’t enjoy a lot of syntactic sugar.
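A sketch of the conventions: an exported type with methods of the form Method(args T1, reply *T2) error (Arith and the port number are made up):
package main

import (
	"fmt"
	"log"
	"net"
	"net/rpc"
)

type Arith struct{}

func (Arith) Multiply(args [2]int, reply *int) error {
	*reply = args[0] * args[1]
	return nil
}

func main() {
	rpc.Register(Arith{})
	l, err := net.Listen("tcp", "127.0.0.1:1234")
	if err != nil {
		log.Fatal(err)
	}
	go rpc.Accept(l) // serve connections in the background

	client, err := rpc.Dial("tcp", "127.0.0.1:1234")
	if err != nil {
		log.Fatal(err)
	}
	var product int
	if err := client.Call("Arith.Multiply", [2]int{7, 6}, &product); err != nil {
		log.Fatal(err)
	}
	fmt.Println(product) // 42
}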
zip
archive/zip lets you read and write zipfiles, but only items compressed with the “store” and “deflate” methods. It uses the same io.Reader and io.Writer interfaces everything else does (or rather, io.ReadCloser and io.Writer).
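A reading sketch (example.zip is a placeholder):
package main

import (
	"archive/zip"
	"io"
	"log"
	"os"
)

func main() {
	r, err := zip.OpenReader("example.zip")
	if err != nil {
		log.Fatal(err)
	}
	defer r.Close()
	for _, f := range r.File {
		rc, err := f.Open() // an io.ReadCloser for this member
		if err != nil {
			log.Fatal(err)
		}
		io.Copy(os.Stdout, rc) // the decompressed contents
		rc.Close()
	}
}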