We consider the problem of learning generalized policies for classical planning domains using graph neural networks from small instances represented in lifted STRIPS. The problem has been considered before, but the proposed neural architectures are complex and the results are often mixed. In this work, we use a simple and general GNN architecture and aim at obtaining crisp experimental results and a deeper understanding: either the policy greedy in the learned value function achieves close to 100% generalization over instances larger than those used in training, or the failure must be understood, and possibly fixed, logically. For this, we exploit the relation established between the expressive power of GNNs and the C2 fragment of first-order logic (namely, FOL with 2 variables and counting quantifiers). We find, for example, that domains with general policies that require more expressive features can be solved with GNNs once the states are extended with suitable "derived atoms" encoding role compositions and transitive closures that do not fit into C2. The work follows an existing approach based on GNNs for learning optimal general policies in a supervised fashion, but the learned policies are no longer required to be optimal (which expands the scope, as many planning domains do not have general optimal policies) and are learned without supervision. Interestingly, value-based reinforcement learning methods that aim to produce optimal policies do not always yield policies that generalize, as the goals of optimality and generality are in conflict in domains where optimal planning is NP-hard.
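To make the two central ideas above concrete, the minimal sketch below illustrates, in plain Python with no neural components, (i) a policy that acts greedily with respect to a learned value function and (ii) the computation of transitive-closure "derived atoms" used to extend states beyond what C2 features can express. The names `greedy_policy`, `transitive_closure`, and the toy value function are hypothetical illustrations, not the paper's implementation.

```python
from itertools import product

def greedy_policy(state, successors, value_fn):
    """Return the successor with the smallest learned value V(s'):
    the policy 'greedy in the learned value function'."""
    return min(successors(state), key=value_fn)

def transitive_closure(pairs):
    """Derived atoms for the transitive closure of a binary relation,
    e.g. above(x, y) from on(x, y) in Blocksworld. Such atoms are not
    expressible in C2, so they are added to states before GNN encoding."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

# Derived atoms: on = {(a,b), (b,c)} yields the extra atom (a,c).
on = {("a", "b"), ("b", "c")}
assert transitive_closure(on) == {("a", "b"), ("b", "c"), ("a", "c")}

# Toy greedy rollout: states are integers, the goal is 0, and
# V(s) = s stands in for the learned GNN value function.
succ = lambda s: [max(s - 1, 0), max(s - 2, 0)]
s = 7
while s != 0:
    s = greedy_policy(s, succ, value_fn=lambda x: x)
print(s)  # 0
```

In the paper's setting the value function is computed by a GNN over the (possibly extended) set of atoms of the state, and generalization is judged by whether this greedy rollout reaches the goal on instances larger than those seen in training.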