Transition-based parsing is a widely used approach for dependency parsing that combines high efficiency with expressive feature models. Many different transition systems have been proposed, often formalized in slightly different frameworks. In this article, we show that a large number of the known systems for projective dependency parsing can be viewed as variants of the same stack-based system with a small set of elementary transitions that can be composed into complex transitions and restricted in different ways. We call these systems divisible transition systems and prove a number of theoretical results about their expressivity and complexity. In particular, we characterize an important subclass called efficient divisible transition systems that parse planar dependency graphs in linear time. We go on to show, first, how this system can be restricted to capture exactly the set of planar dependency trees and, secondly, how the system can be generalized to k-planar trees by making use of multiple stacks. Using the first known efficient test for k-planarity, we investigate the coverage of k-planar trees in available dependency treebanks and find a very good fit for 2-planar trees. We end with an experimental evaluation showing that our 2-planar parser gives significant improvements in parsing accuracy over the corresponding 1-planar and projective parsers for data sets with non-projective dependency trees and performs on a par with the widely used arc-eager pseudo-projective parser.