ARX (autoregressive with exogenous input) models are a common class of models for dynamical systems. Here, we consider the case where the innovation process is not well described by Gaussian noise and instead propose to model the driving noise as Student's t distributed. The t distribution has heavier tails than the Gaussian distribution, which provides increased robustness to data anomalies such as outliers and missing observations. We use a Bayesian setting and design the models to include automatic order determination; that is, we infer the posterior distribution of the model order from data. We consider two related models, one with a parametric model order and one with a sparseness prior on the ARX coefficients. We derive Markov chain Monte Carlo samplers to perform inference in these models. Finally, we provide three numerical illustrations, using both simulated data and real EEG data, to evaluate the proposed methods.
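To make the noise model concrete, the following is a minimal sketch of simulating an ARX process driven by Student's t innovations. It is illustrative only and is not the inference method of this work. The function names `student_t` and `simulate_arx`, the sign convention y_t = sum_i a_i y_{t-i} + sum_j b_j u_{t-j} + e_t, and all parameter values are assumptions chosen for the example.

```python
import math
import random


def student_t(rng, nu, scale=1.0):
    """Draw one Student's t variate with nu degrees of freedom.

    Uses the standard construction: a standard normal divided by
    sqrt(chi2 / nu), where chi2 is chi-squared with nu degrees of freedom.
    """
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(nu / 2.0, 2.0)  # Gamma(shape=nu/2, scale=2) is chi2(nu)
    return scale * z / math.sqrt(v / nu)


def simulate_arx(a, b, u, nu=3.0, scale=0.1, seed=0):
    """Simulate y_t = sum_i a[i]*y[t-1-i] + sum_j b[j]*u[t-1-j] + e_t.

    a, b  : hypothetical AR and exogenous coefficients (assumed convention)
    u     : known input sequence
    nu    : degrees of freedom; small values give heavier tails
    Samples before t = 0 are treated as zero.
    """
    rng = random.Random(seed)
    y = []
    for t in range(len(u)):
        ar = sum(a[i] * y[t - 1 - i] for i in range(len(a)) if t - 1 - i >= 0)
        ex = sum(b[j] * u[t - 1 - j] for j in range(len(b)) if t - 1 - j >= 0)
        y.append(ar + ex + student_t(rng, nu, scale))
    return y


# Square-wave input and a stable second-order system (assumed values).
u = [1.0 if (t // 25) % 2 == 0 else -1.0 for t in range(200)]
y = simulate_arx([0.5, -0.2], [1.0], u, nu=3.0)
```

With a small `nu`, occasional large innovations appear in `y`, which is the outlier-like behaviour that the heavy-tailed noise model is meant to accommodate.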