This post is a follow-on to Dr. Cameron Murray’s reply to my first post on migration. In my first post I published the following graph demonstrating that Australia’s population included a ‘missing million’ people who we counted as resident, but who were actually overseas at any given time.

Figure 1: The cumulative discrepancy between official NOM and the population actually in Australia has exploded

Cameron’s response was to investigate some further data from ABS 3401 to see if he could determine roughly where these people were, and what sort of travel they were engaged in. He produced four graphs which led him to conclude that the Missing Million were quite probably substantially composed of newly retired Baby-Boomers and their families, who were likely to be on short holiday visits to places like Bali in Indonesia.

The investigation and further research are useful and welcome, as they at least attempt to engage with some of the complexity surrounding migration measurement, which I’ve lamented is universally glossed over. After the guys from Macrobusiness seized on Cameron’s response as a ‘rebuke’, I also lamented how even this narrative exposes the way the details of migration could lead to completely counter-productive knee-jerk reactions, which Macrobusiness seems inclined to make. A million boomers who will come back from holiday to get sick and die in Australia will lead to a completely different policy response to a million young ‘migrants’ getting jobs and starting families here who spend a chunk of the year back in their other ‘home’. Kicking out the latter will do nothing to help ease the pressures created by the former.

However, the four graphs that Cameron has produced provide a great example of the dangers of drawing quick conclusions from data. He seizes on a narrative which relies on committing the oldest data-interpretation error in the book (confusing correlation with causation) four times over in quick succession, when a few simple techniques (like eye-balling the raw relevant data, or doing a back-of-the-envelope calculation to check orders of magnitude) show that the evidence doesn’t stack up, and possibly points elsewhere.

Why we need to talk about Kiwis

The alternative narrative that I’ll champion here is that a larger share of the Missing Million is actually Kiwis, probably relatively young ones, who have come to Australia to work or study, but spend a rather significant amount of time (in various trip-lengths) back in New Zealand. I won’t try to prove that it’s the dominant component, since there isn’t clear evidence for that either; in fact it appears that the Missing Million must comprise quite a variety of traveller types. But it fits the data better than the Boomers in Bali story, since departures to New Zealand are the dominant component of short-term departures, so it’s a reasonable angle to take to refute my ‘rebuke’.

It’s also a fun narrative to choose because it contrasts so well with Cameron’s hypothesis in policy terms. Unlike Aussie Boomers returning from their retirement holidays, it’s quite possible that the New Zealanders will actually make a general exodus at some point, and decide that New Zealand is at least as good a place to live in their next life stage, such as raising a family or retiring. It’s also possible that they might continue to use a significant number of services (like health-care) in New Zealand on their trips back home.

Perhaps best of all, the case of the Kiwis highlights the misconception that ‘Net Overseas Migration’ is closely linked to the “Permanent Migration Program”, which small-Australia advocates like Macrobusiness continually urge the Federal Government to reduce. Unless we want to tear up our reciprocal arrangement with our New Zealand brothers over the ditch, this significant component of migration isn’t something that cutting the permanent migration program will have even the slightest impact on. Kiwis don’t need a permanent visa to live here permanently, and neither do we in New Zealand.

As we can see here from the Department of Immigration and Border Protection, there are more New Zealanders in Australia than any other type of temporary entrant. Well over 600,000.

Figure 2: New Zealanders make up the largest share of ‘Temporary Entrants’ in Australia

However, New Zealanders have reciprocal arrangements with Australia that allow them to remain ‘temporary’ permanently, and do all the things that we normally associate with permanent residents, like working, and coming and going as they please. As a result, they actually constitute a surprising slab of the official “Net Overseas Migration” intake. In fact, since 2008 when Australia’s supposedly turbo-charged migration intake really started, New Zealanders have contributed just as much as almost any other category except students.

Figure 3: NOM Breakdown

I can’t help but make a further passing observation here: all the permanent visa categories (in black boxes) during this period rarely add up to much past 70,000, which is about the number that most of the small-Australia advocates call for anyway. It’s the influx of temporary entrants that has really driven the appearance of extremely high migration during this period, which is consistent with the broad thesis I outlined in a previous post about ‘mobility’ rather than ‘migration’ having made a level-shift. Cutting the entire “Permanent Migration Program” from 190,000 to zero won’t do anything to reduce two thirds of the migration we actually have. In fact it would probably just drive a further increase in temporary entrants, including on Bridging visas, but I’ll focus on that charge another day.

The Boomers in Bali Narrative

The image Cameron Murray created of the Missing Million relied on four characteristics in short-term departures: reason for departure (holiday), length of stay (short), destination (Indonesia), and age (older).

Sure enough, he has a chart which says the fastest growth comes from holidays:

Figure 4: Holiday travel seems to have risen the fastest

And of course, there’s a graph which says that the fastest growth is in short-term trips as well.

Figure 5: Very short travel seems to have risen the fastest

And departures to Indonesia seem to have risen the most sharply from the mid-2000s as well:

Figure 6: Indonesia rose very quickly as a destination

And there’s a hint as well that somewhat older people might make up a higher fraction of travellers lately too.

Figure 7: Older people are travelling more in relative terms

So it would be very tempting to conclude that the same set of travellers are driving the relevant trends in all of those cases, and that those travellers also constitute a significant share of the “Missing Million”. To do so would be to essentially make a whole sweeping set of assumptions which haven’t even been discussed, let alone tested, and some can be demonstrated to be substantially untrue.

Perhaps the easiest to tackle is the last graph, which was used to suggest that a significant number of travellers might be the old boomers. The problem with this graph is that it’s expressed as a percentage. As I’ve previously outlined, the most striking trend in overseas travel is that it’s increased, quickly and relentlessly. If the two curves shown in Figure 7 referred to numbers of movements, the 2015-16 lines would be consistently far higher (about one and a half times) than the 2005-2006 lines. Whilst a somewhat higher share of travellers are 60-something now than previously, movements are now so high in all other categories that the change that caused the discrepancy could be in any other age-group just as easily.
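To make the share-versus-count point concrete, here is a minimal sketch. Every number in it is hypothetical and chosen only for illustration (a total that grows 1.5 times, and a 60-something share that rises from 10% to 12%); none of it is taken from the ABS data, and the function name is my own.

```python
def absolute_change_by_group(total_before, total_after, pct_before, pct_after):
    """Convert percentage shares into counts and return the change per group."""
    changes = {}
    for group in pct_before:
        count_before = total_before * pct_before[group] / 100
        count_after = total_after * pct_after[group] / 100
        changes[group] = count_after - count_before
    return changes


# Hypothetical illustration only: total movements grow 1.5x while the
# 60-something share rises from 10% to 12%.
changes = absolute_change_by_group(
    total_before=1000,
    total_after=1500,
    pct_before={"60-69": 10, "all other ages": 90},
    pct_after={"60-69": 12, "all other ages": 88},
)
for group, change in changes.items():
    print(f"{group}: {change:+.0f} movements")
```

Under these made-up numbers the group whose share rose accounts for the smaller part of the absolute increase, which is why a percentage chart alone can’t tell us which age group drove the growth.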

With regard to the other data, my first and favourite technique to test a hypothesis is to just have a look at the relevant data with the Mk 1 eyeball. Adding in the cumulative discrepancy between NOM and Net Movements, we can see how and when the Missing Million actually grows:

Figure 8: The cumulative missing million doesn’t match

As we can clearly see there, the discrepancy between NOM and Net Movements seems to have really become serious in the very early 2000s, quite possibly before the marked acceleration in Short-Term Departures. Short-Term Departures do, however, seem to track the cumulative discrepancy somewhat thereafter. The same isn’t the case with Departures by destination:

Figure 9: Travel to Indonesia was low while most of the Missing Million left.

Here it can be seen that the accumulation of the first half of the Missing Million occurred while travel to Indonesia was actually low, and not growing. Furthermore, at precisely the time of the fastest acceleration in travel to Indonesia, NOM and Net Movements came closest to aligning. In contrast, travel to New Zealand grew consistently throughout this period, not to mention being higher throughout. I’d argue that NZ looks more likely by far at this stage.
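For readers who want to reproduce the cumulative discrepancy line overlaid in Figures 8 and 9, the calculation is just a running sum of the gap between NOM and Net Movements. The sketch below assumes two equal-length series covering the same periods; the function name and the sample figures are hypothetical, not the published data.

```python
from itertools import accumulate


def cumulative_discrepancy(nom, net_movements):
    """Running total of (NOM - net movements), one value per period."""
    if len(nom) != len(net_movements):
        raise ValueError("both series must cover the same periods")
    return list(accumulate(n - m for n, m in zip(nom, net_movements)))


# Hypothetical per-period figures, for illustration only.
nom = [100, 120, 150, 200]
net_movements = [95, 100, 140, 150]
print(cumulative_discrepancy(nom, net_movements))
```

A rising line means people are being added to the official count faster than they are physically accumulating in the country.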

My second favourite technique is to do a quick calculation with some numbers, to see whether things add up even roughly, at least to within an order of magnitude. Looking at the Short-Term Departures by duration, we can multiply each series by an appropriate number to see what sort of impact each component might actually have on people actually absent from Australia. Not having any better information, I assumed that the distribution of movement lengths in each category was reasonably flat, and chose 4, 11, 22, 46, 77, 165, and 273 as the probable average number of person-days that each departure would remove from Australia’s Physically Present Population. I then took the total over the previous 12 months and divided by 365 to find the total number of ‘person-years’ that each set of departures could contribute at any given point, to see which contributed the most.
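The back-of-the-envelope calculation described above can be sketched roughly as follows. The day weights are the ones quoted in the text; the layout of the input (one row per month, one column per duration category, in the same order as the weights) and the reading of ‘previous 12 months’ as a rolling 12-month sum are my assumptions, and the function name is hypothetical.

```python
# Assumed average days absent per departure, one per duration category
# (shortest to longest), as quoted in the text.
AVG_DAYS_ABSENT = [4, 11, 22, 46, 77, 165, 273]


def person_years_absent(monthly_departures, avg_days=AVG_DAYS_ABSENT, window=12):
    """Rolling person-years absent per duration category.

    monthly_departures: list of rows, one per month; each row holds one
    departure count per duration category, in the same order as avg_days.
    Returns one row per complete `window`-month period.
    """
    results = []
    for end in range(window, len(monthly_departures) + 1):
        rows = monthly_departures[end - window:end]
        # Sum each category's departures over the window.
        totals = [sum(column) for column in zip(*rows)]
        # Person-days absent, converted to person-years.
        results.append([t * d / 365 for t, d in zip(totals, avg_days)])
    return results
```

For example, 1,000 departures a month in the shortest category contributes 12,000 x 4 / 365, or roughly 130 person-years absent, whereas the same number in the longest category contributes roughly 9,000.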

Figure 10: Departures of longer than one month have the largest contribution to persons absent

We can see here that the impact of trips less than two weeks is very greatly diminished. The largest contributor to the Missing Million is likely to be trips of over a month, hardly a fleeting holiday.

But more interestingly, we can see that the short-term departures of all the trips under two months clearly can’t add up to anything like a million people. I did some calculations and found that in fact at the end of the data all the short-term departures probably only cumulatively add up to under 800,000. I suspect the assumption of a flat distribution across all the categories is likely to overstate absence if anything. Furthermore, at the end of 2001, before the Missing Million had left, those departures accounted for about 400,000 people absent. So it really isn’t plausible that all these departures listed actually account for anything like the full Missing Million, or probably even half of that. The sub-set of Boomers that Cameron Murray describes is doomed to be a trivial minority in any case.

There’s also the possibility of a significant shift internally within categories. It could be the case that my assumption of a flat distribution across each time interval is false, and some change has quite systematically led some of the categories to become increasingly skewed, probably to the left (shorter trips), since the frequency of travel has risen so much. The switching over of the 6-12 month and 3-6 month lines seems to show some evidence of that occurring; however, if a shift in that direction also occurs internally within the categories, we would likely have even less of the Missing Million accounted for.

A reality-check on Data Quality

There are a few things that could explain such a result, and as always it’s probably best to go back to the source of the data to try to understand why it is possible. A significant factor is that the data presented here is based on people’s indicated intent, as per their passenger departure cards, like this one:

Figure 11: The departure card assumed that people know whether they are ‘resident’ or ‘temporary entrant’.

The largest failure in these cards is asking people to self-select what type of traveller they are, most confusingly whether they are a ‘visitor or temporary entrant’ or ‘Australian resident’. Since these cards don’t come with any explanation of the 12/16 rule of ‘residence’ for migration purposes, they would make no sense for our increasingly part-time population, many of whom are foreign citizens, here on a temporary visa, but staying (as a student, worker, or backpacker) for long enough to officially qualify as ‘resident’ for migration purposes. So the entire data-series we’re working with here really is likely to be fraught with uncertainty, and doesn’t promise to consistently include those who should be counted, or exclude those who should not be.

Furthermore, there’s absolutely no obligation on the traveller to honour their ‘intent’ regarding travel. Often they may not have actually booked their onward or return flight. If I had to speculate a little as to the micro-economic dynamics that might be at work, I would think that plenty of travellers (particularly those on temporary visas in Australia) would wind up indicating something that was a poor reflection of what actually happens. With more and more people making multiple stops on their travels, with fewer of them planned in advance, they’re likely to make some educated guess about what the scary people at the customs gate are going to be most happy to hear, (including about their residency status) and just report that.

To add to this, not all the data captured in departure cards is comprehensively enumerated. According to the ABS, on average, only about 5% of the cards are selected for a sample, and most of the rest is carefully imputed. Many of the methods I’m using below could well struggle if there are even moderate errors or poor assumptions made in this sampling or imputation process. The strange results produced by our New Zealand regression below could be an indication of uncertainty or inconsistency in this process as much as anything else.

But perhaps more importantly, given the way that migration is defined under the 12/16 rule, there’s absolutely nothing preventing these ‘short term’ travels from actually contributing to Net Overseas Migration, in which case they can’t explain a discrepancy between that number and Net Movements. Two six-month trips a couple of months apart will constitute migration. As will a two-week trip, if someone was travelling a lot in the previous year.
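A rough sketch of how the 12/16 rule plays out for a single traveller is below. It is a simplification under stated assumptions rather than the ABS’s actual procedure: days are plain integers, 12 months is approximated as 365 days and 16 months as 487 days, and the 16-month window is only tested starting from each departure. The function names are hypothetical.

```python
def days_absent_in_window(absences, start, end):
    """Total days spent overseas between day `start` and day `end`.

    absences: list of (departure_day, return_day) pairs as integer day numbers.
    """
    return sum(max(0, min(ret, end) - max(dep, start)) for dep, ret in absences)


def counts_as_nom_departure(absences, window_days=487, threshold_days=365):
    """True if, in some window starting at a departure, the traveller is
    absent for at least `threshold_days` out of `window_days`.

    The day counts are approximations of 16 and 12 months (assumption).
    """
    return any(
        days_absent_in_window(absences, dep, dep + window_days) >= threshold_days
        for dep, _ in absences
    )


# Two roughly six-month trips, two months apart.
print(counts_as_nom_departure([(0, 183), (243, 426)]))
# A single two-week trip on its own.
print(counts_as_nom_departure([(0, 14)]))
# A two-week trip following a long absence in the previous year.
print(counts_as_nom_departure([(0, 351), (400, 414)]))
```

The point of the sketch is that what matters is the accumulated time away within the window, not the stated length of any single trip, so a ‘short-term’ departure on the card can still end up inside Net Overseas Migration.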

Missing arrivals instead?

Trying not to be disheartened, we could look to see whether what data we do have could still clarify things further. So far we’ve only looked at half (or less) of the story. We have short-term arrivals as well:

Here a couple of things are striking. In particular, the under-1-week movement trend is usually the highest, followed by 1-2 weeks. People arriving in Australia seem to report a far shorter intended stay than those leaving. And the total number of arrivals is far, far lower than the departures. This is also consistent with the hypothesis I’ve outlined earlier, that we’re a hard place to get to for a short trip, but a good place to leave from for one.

Figure 12: Relatively long ‘Short-Term’ Arrivals have the largest contribution to persons present

Here the plausible ‘second half’ of the story emerges. By far the largest contributor to ‘person-years’ present in Australia from short-term arrivals is actually the longest intended trips, of over six months, and the second-largest is 3-6 month visits. Importantly, there’s a far larger possibility that these ‘short-term’ visits will wind up being counted in Net Overseas Migration. Students or backpackers who stack up a couple of 6+ month visits inside a couple of years (probably even more likely than staying continuously) will almost certainly officially ‘migrate’. In stark contrast to the case for departures, the shorter categories are relatively tiny, and didn’t grow at all during the period when the Missing Million emerged. This goes a lot further towards explaining the true origins of the Missing Million, and further supports the hypothesis outlined in my earlier post. The sorts of ‘visitors’ we get tend to be longer-term visitors, but still visitors, whereas our travel outwards tends to be for faster visits.

Again, summing the total ‘person-years’ accounted for by this trend, we find there are just under 800,000 people in 2017, and just over 400,000 in 2001. Three quarters of this increase can be accounted for by movements where the stated intent was longer than six months. If a large fraction of these are counted within migration, and I think it’s almost certain that they are, then the person-years present due to short-term visitors would barely have moved. If a significant share of the 3-6 month intentions were also counted as ‘migration’, then the person-years present due to short-term arrivals would have moved backwards. This could explain a further part of the discrepancy between Net Overseas Migration and Net Movements. However, it also seems unlikely to add up to enough to reach the million.
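The arithmetic behind ‘barely have moved’ can be laid out as a quick sketch. It uses the rounded figures from the paragraph above (about 400,000 and about 800,000 person-years, with three quarters of the increase from 6+ month intentions) and assumes, purely for illustration, that all of that three-quarter share is reclassified as migration. The function name is hypothetical.

```python
def residual_increase(start, end, share_reclassified):
    """Increase left over once a share of it is counted as migration instead."""
    increase = end - start
    return increase - increase * share_reclassified


# Rounded figures from the text; treating the whole 6+ month share as
# migration is an illustrative assumption.
print(residual_increase(400_000, 800_000, 0.75))
```

On those rounded numbers only about 100,000 person-years of growth would remain attributable to genuinely short-term visitors, before taking any of the 3-6 month category out as well.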

To confirm the failing significance of passengers’ stated intentions, it looks like the ABS is also going to abandon these particular series.

An interim summary

So far, having just eyeballed some data and done a couple of quick calculations to see how the numbers stack up, it appears:

  1. The Missing Million grew to half its height while travel to Indonesia was low and flat, and travel to New Zealand was increasing.
  2. Person-years absent due to ‘short-term’ departures are driven by travel of between two weeks and two months. Very short holidays don’t contribute much. But all this data on stated intent can’t account for the Missing Million.
  3. Person-years present due to arrivals is driven by long stays of 3+ months, and overwhelmingly by 6+ months. If a large slice of these are counted in ‘Net Overseas Migration’ movements, the loss of arrivals could account for another part of the Missing Million.
  4. The data is fraught with uncertainty in any case, and all our work here should be taken with a grain of salt.

Unpacking correlation

What we haven’t yet tackled seriously is the assumption of a high overlap between the four different trends, or at least the three which we have data on. To establish or refute this, it’s better that we look at the information available to us in its raw (monthly) format. As we can see, the intense seasonality present in some of these time-series can provide us with a key to a possible analytic mechanism for investigating the plausible degree of overlap.