in the summer of ninety i was a twenty one year old astronomy undergrad. in recent months, the berlin wall had fallen, and the hubble space telescope had finally reached orbit. i was living in a one room apartment by the back patio of a historic house, just north of campus. my landlady, martha ann zively, an eighty three year old local legend, lived overhead. rent was a hundred and fifty dollars per month. mobile phones, laptops, and the web were still over the horizon. twin peaks would appear in a few months, and the austin garage rock scene was again building steam.
the previous fall i’d become a research assistant with the hubble space telescope astrometry group. the group had members from the astronomy department, mcdonald observatory, the aerospace department, and the center for space research. my supervisor, paul hemenway, was a research scientist associated with all of these organizations. paul explained the group’s projects, and where i might be able to help. hubble was designed for exact and stable pointing, since angular resolution is sensitive to motion smear. in a sophisticated, and still unique, approach, three optical interferometers were mounted on robotic arms in the hubble focal plane. these fine guidance sensors would provide feedback to the pointing control system. to evaluate and calibrate this system, exact reference points were needed.
we’d use asteroids, as part of the texas minor planet project. with help from group members in the center for space research and the aerospace department, we could refine our knowledge of asteroid orbits to the point where their predicted positions and motions were more exact than hubble observations, and could be used as references. our primary tools would be a digital equipment corporation vax vms cluster at csr, the astronomy department vax running unix, various unix workstations, and the eighty two inch telescope at mcdonald. it could directly image an asteroid on a glass photographic plate, given enough time. for the long exposures needed to gather enough light from these dim objects, we depended on a data general nova minicomputer and a suitcase sized camera.
this was the cassegrain camera, another sophisticated piece of custom, analog eighties hardware. for exact guiding it used an image dissector, a photomultiplier tube attached to the side of the camera. it looked more like an engine part, an oil filter or an ignition distributor, than a science instrument. its circular field of view was divided into four quadrants, and a guide star in the camera focal plane was manually positioned at the center. then the closed loop control system was activated, and every second, with a loud mechanical click, the system would adjust the camera position and try to keep the guide star at the center. on a phosphorescent screen, the fuzzy image of the star bounced randomly about, now nearer the center, now further out.
here’s a walk through of observing with the camera, to give a feel for the era at the end of the eighties. an observing night began in the early evening, before sunset. down in the control room, which curved in a ring below the main floor of the telescope dome, the minicomputer and its control programs had to be started. from the control room one could step out onto a catwalk running in a circle around the dome, for a spectacular view of the shadows growing out from the mountains and across the high desert, and the lofty blue sky. various obsolete minicomputers and odd bits of hardware crowded the control room with enough pieces to make period sets for more than one sixties scifi movie. the curving walls and raised computer flooring added to the overall feel of an abandoned cold war bunker. on the computer consoles, red leds showed the astronomical time and the telescope pointing, and command line cursors blinked on a few terminals scattered about the room.
the glass photographic plates for the night were waiting on a desk, and as it grew dark we prepared for the first exposure. we first used a command line program on the nova to generate telescope pointing information for the asteroids we were interested in on this particular night. using this nova may have been my last contact with old school big eight inch floppy disks. after jotting down notes for our planned observations, we took the plates up into the dome, where it was pitch black except for clouds of stars in the open slit. the telescope loomed overhead in the darkness, and we carefully climbed the stairs up onto the circular telescope floor, which hydraulically rose and sank in order to stay near the camera as the telescope and dome moved ponderously toward widely separated points on the sky. one could easily walk off of the edge of the telescope floor in the dark, when it happened to be high above the dome floor.
using the bulky hand controller hanging from a hook on the telescope, we positioned the floor so that the camera was at eye level. it was a boxy instrument of white enameled metal, with the image dissector tube protruding at the side. sliding out the flat cover over the bottom opened a rectangular frame of stars large enough to place one’s head inside, with the silhouette of the telescope secondary mirror housing and its spidery support struts high above. here inside the body of the camera was the focal plane of the telescope optics, where the photographic plate was fastened for exposure.
at the beginning of an observing night, we first had to check and calibrate the instruments, particularly on the first night of an observing run. observatory staff had mounted the camera to the telescope, and connected power and data lines, but fine calibrations were always needed. most importantly, the telescope had to be focused to adjust for thermal and structural changes. this meant adjusting the position of the secondary mirror within its housing, high overhead. a rocker switch on the hand controller activated a noisy motor to move the secondary inward or outward relative to the camera. the exact determination of focus was done old school style, using the knife edge test.
in the exact focal plane of the telescope, all of the light from a star converges through a single point. when a knife edge cuts through that point, the light from the star vanishes instantly. if the knife edge dimmed the star gradually, then the secondary mirror position needed to be adjusted. we wanted the point of pure, instant cutoff to be exactly where our photographic plates were held by the camera. there was a special metal frame with the shape and size of a photographic plate that we fastened into the camera, holding a straight metal edge. we placed one eye just behind the edge and then carefully watched it cut off the light from stars as we adjusted the secondary mirror. if there was a bit of spare time, the knife edge frame could be replaced with another metal frame holding the eyepiece, a heavy glass lens that required both hands to lift. peering inside, one saw a colorful and mysterious world of nebulas and galaxies.
once the telescope was ready, we could prepare the camera. with the high angular resolution and small field of view of the telescope, the apparent motion of asteroids relative to the sky was significant over an interval of around ten minutes. each asteroid was a bit different, and various orbit characteristics had to be taken into account. the relative direction and rate were computed, and the camera body was rotated in its mounting and programmed to move at the appropriate rate so that the asteroid would appear to be motionless.
at the end of the image dissector was a screen similar to an oscilloscope and divided into four quadrants. light from a sufficiently bright star, cascading down through the photomultiplier tube, formed a dim blob on the screen. we found a good star and centered it in the screen. when the tracking control loop was activated, it adjusted the camera position once per second, with a mechanical click, to keep the star at the center of the screen.
an exposure began with stellar tracking. then the steady clicking of the control loop would go silent for a period. the sky would turn while the asteroid built up its own small gaussian peak in the photographic emulsion. then the clicking would return. the result was a dumbbell shape for stars, with two circular peaks connected by a trail. the asteroid was a trail with a single circular peak at its midpoint. these peaks and trails became visible the next day when we developed the plates. each had many dumbbell shaped stellar trails, short or long, thick or thin, and at the center a single ufo shaped asteroid.
the next steps were to extract information from an exposure and improve knowledge of an asteroid orbit. this took place back in austin, where the center for space research and department of aerospace engineering became involved. their expertise in orbit determination played an important role in the hubble astrometry group. the space age was roughly thirty years old and the center for space research represented the first generation, with ray duncombe, byron tapley, and bob schutz involved from the beginning.
before numerical orbit analysis could proceed, the plates had to be measured using an old school scanner and minicomputer in the back room of the astronomy department library, on the thirteenth floor of robert lee moore hall. that was my primary job and i spent many hours in the scanning room. it was a meditative place. much of the room was dark except for computer lights, and the electronics fans made a steady drone. the long back wall was covered with cabinets containing thousands of glass plates, including historic full sky sets of survey plates from palomar and the european southern observatory, and many more small plates from mcdonald. black plastic sheets divided off the back end of the room to keep out stray light, and alone at the center of this cave sat the pds microdensitometer.
this was a measuring engine for mechanically scanning photographs using analog, pre-silicon, pre-laser technology. it was an interesting time capsule of sorts, involving dead end technologies that had probably been driven forward by cold war priorities. at its heart was a vertical beam of light. today this would be a laser, but in the pds light from a special bulb at the top was focused into a beam downward through a photograph mounted on a mechanically driven stage with analog position encoders. a photometer below the stage measured the intensity of the light while the stage moved in a raster pattern. discrete sampling of the photometer and encoders by the minicomputer created a digital result.
my first observing run at mcdonald was in the early summer of ninety, probably may or june. professional chat in the astronomers temporary quarters was about the problems with hubble that were being discovered immediately after its launch and repeatedly making headline news. i remember clearly that there was lots of discussion of the high gain antennas, because news of the catastrophic problem with the main mirror had not yet leaked out. overhearing the veterans during those days at mcdonald was an early revelation about the realities of high tech. paul and i had made the day long drive to west texas. we spent three or four nights making plates with the eighty two inch, and then made the day long drive back to austin. the texas heat was just beginning to get really intense, and a few days after our return i made the sweltering walk over to rlm and happily settled into the cool darkness of the scanning room. my little apartment was already uncomfortably warm during the day, even with the air conditioning running.
mcdonald plates were slightly smaller than a normal sheet of paper, and the glass was fairly thin. they felt fragile enough that taking extra care was natural. held up against a background light, the star and asteroid trails were small dark smudges. with the plate secured to the scanning stage and looking right along the top surface, one could see the dull black trails of photographic emulsion on the surface of the glass. the controller had to be told what areas on the plate to scan, and preparation involved adjusting the stage to move the scanning beam about the plate and noting the coordinates.
at the top of the pds, roughly at eye level, was a circular glass screen showing a magnified image down the scanning beam through the plate and stage. individual grains of photographic emulsion were easily visible on the screen, and when the beam was by a star trail it appeared as a fuzzy black worm. this was old school optics, essentially a microscope projecting directly onto the screen. the stage was adjusted using two finely geared knobs and the coordinates of the scanning beam were shown by two sets of red leds on the pds console, below the stage. the corners of a rectangle about a star trail were the coordinates for a raster scan, and were entered in manually at the keyboard.
the controller was a tall rack with electronics mounted in cases standing in the back corner. on a table beside the rack was a heavy monitor. it had one of the first primitive unix graphical interface systems i got to know well, and already had the slightly antiquated feel of an earlier era. a scanning session meant creating a set of digitized raster images, ultimately archived on old school nine track half inch tape, one file for each trail scanned by the pds. a group of files, say thirty to fifty for a plate with a good exposure and lots of stars, was created in the filesystem of the control computer and then written to tape using its sibling above on the sixteenth floor, which had the tape drive. the shift over the border from analog to digital took place in the seventies style electronics connecting the pds to the controller.
a few days after scanning that first plate, i went to meet with paul and ray in the aerospace building. i can clearly remember stopping in the texas sun. overhead was the typical hard blue summer sky with little white clouds, and i was already sweating just seconds after stepping outside. exactly which stars were on that plate? how could we identify them in order to determine the position of the asteroid? was there a program on the astronomy or aerospace computers to do this? the answer was no, there was not an easy or obvious solution. helping to figure out a practical method for our particular plates was part of my job, not that an undergrad was expected to solve the problem, but at least to get a feel for the questions involved. how did one recognize stars? humans could do it, but could an eighties computer system?
thirteen years later, i entered the aerospace graduate program and went to work at the center for space research. bob schutz was my boss for the next eleven years. my job concerned star trackers, descendants of old school celestial navigation sextants. once again i was dealing with images containing a scattering of unknown stars. within aerospace it’s a classic and has its own name, the lost in space problem. given an image of some stars, what are we looking at? aerospace has its own perspectives, cultural bents, and tools. astronomers didn’t generally think in terms of three dimensional unit vectors, rotation matrices, quaternions, and vector matrix notation, and it was soon apparent that the concerns and methods in aerospace were much more widely applicable than those in astronomy, bringing together optimization, control, data fusion, high performance computing, and machine learning.
within a few weeks of beginning, star identification was again one of my major concerns, and once again the first question was whether a practical solution was available. i checked back with people in the astronomy department, after being out of touch for nine years or so. pete shelus from the hubble astrometry days was a member of our group, and pointed me in the right directions. there was a strong sense of continuity, that here was a problem which really needed addressing. computers were now more powerful and digital imaging was now standard. there was no longer an analog to digital divide to cross, everything was in binary format from the beginning.
with icesat, the onboard control system provided accurate attitude estimates and it was straightforward to predict which stars each image contained. this wasn’t obvious or simple at first, and i put a lot of effort into understanding the data pouring down from the spacecraft. there were four star imagers of three different types onboard, all working at ten hertz or more. these were eighties vintage star trackers and did not automatically provide star identifications, as later generations do. there was also the high frequency angular rate data from the gyro unit to consider.
for each star image, we could compute a pointing vector which greatly simplified star identification. it was usually enough to ask the question, is a star of the appropriate brightness found near the predicted position in the image? the pure star identification problem tends to become obscured when brightness is introduced, not least because it’s difficult to measure or predict spectral response and effective brightness for real imagers. an image usually has better spatial information than brightness information. an astronomer interested in brightness does photometry with dedicated sensors, not standard imagers.
an additional check was that the angles between observed star pairs matched predictions. star pairs moved in the direction of star triangles, and became the basis for modelling the image distortions. one of my first jobs was to model the distortions and other errors of the icesat star trackers. at this point i really began exploring the research literature on star identification and related topics, and discovered a self contained little intellectual world.
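the pair angle check can be sketched in a few lines of python. this is a minimal illustration, not the icesat processing code: the function names and the ten arcsecond tolerance are assumptions made up for the example.

```python
import math

def unit_vector(ra_deg, dec_deg):
    """Convert right ascension and declination (degrees) to a 3D unit vector."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))

def pair_angle_deg(u, v):
    """Angle between two unit vectors, in degrees.

    atan2(|u x v|, u . v) stays accurate for the small angles
    typical of star pairs, where acos(u . v) loses precision.
    """
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    cross = math.sqrt(cx * cx + cy * cy + cz * cz)
    dot = u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
    return math.degrees(math.atan2(cross, dot))

def pair_matches(observed_deg, predicted_deg, tol_arcsec=10.0):
    """True if observed and predicted separations agree within the tolerance.

    The default tolerance is an illustrative assumption, not a flight value.
    """
    return abs(observed_deg - predicted_deg) * 3600.0 <= tol_arcsec
```

the leftover differences between observed and predicted pair angles are the raw material for a distortion model, since a perfect imager would leave none.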
its roots go back to old school celestial navigation. the technology evolved from the age of sail, and in the second world war many large aircraft had a bubble window on top for a navigator to make stellar observations. after the war, computing and imaging meant increasing automation. the cold war created an enormous driving force behind the technology. many people became vaguely and uneasily aware of guidance systems. they had a certain aura of spy movies and armageddon. while computers and inertial guidance were where most of the funding ended up, automated star tracking quietly matured in parallel. star trackers are critical for spacecraft, and are still used on certain high altitude aircraft. the classical period, when the fundamental concepts were sketched out, was the seventies and eighties, the high cold war era, with many physical scientists cut loose by the end of the sixties bubble.
it soon became clear that there were still no public solutions for the star identification problem. it seemed that each time star identification software had been developed, it had been for a government or commercial project. if you were serious about star identification, you probably wanted to sell star trackers. that’s a mature industry now, with plenty of sellers and not a lot of buyers. there’s no motivation to think that way anymore. eventually i’d meet with the university intellectual property office concerning open source licensing of the software we were developing.
another thirteen years had passed. excitement was growing about advances in deep neural networks, especially at google, which had just open sourced tensorflow. for a number of reasons, it was clearly time to tackle the problem directly, using both rules based and machine learning methods in parallel.
the concept was to start from scratch as a github open source project, and to integrate tensorflow from the beginning. this meant working in c++ and python numpy. the only external input was to be a list of star positions, for which nasa’s skymap star catalog was ideal. skymap was created in the nineties for use with star trackers and we’d used it extensively for icesat, even collaborating with its creators. when hubble was launched, one of its early problems was bad guide stars. as part of the overall hubble recovery effort, nasa funded the efforts resulting in the skymap catalog. its purpose is to describe the sky as seen by the standard star trackers of the nineties, and more generally by typical digital imagers.
viewing skymap as simply a list of star positions, how does one generate a star image? the root of the problem is, for a list of points on a sphere, which points are near a particular point. for example, given a list of positions across the surface of the earth, which are near a particular latitude and longitude. the usual answer involves dividing the sphere up into tiles, or transforming it into a square and then subdividing. the square sky is not unheard of. we used a more dynamic and flexible approach based on the work of daniele mortari. it’s closely related to lookup and hash tables, but has some unique and interesting quirks.
the key is to view stars and points on the sky as unit vectors, with three components between minus one and plus one. we’re searching for stars within small ranges of each component. picture narrow rings on the sky about each axis, and finding the stars inside the small region where the rings intersect. three dimensional search reduces to three separate one dimensional searches for small ranges of values, followed by an intersection operation on the three sets of results. each one dimensional search is performed on a separate table, with sorted numbers from minus one to one as keys and star identity integers as values. we’re calling this data structure a float int table. performance can be improved by fitting a curve to the floats and using it to calculate the low and high indexes into the table, creating a kind of ranged search hash table with the fitted curve acting as a hash function.
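here is a minimal python sketch of the idea, under some loud assumptions. the class and function names are invented for illustration, and a plain binary search from the standard library stands in for the fitted curve, so this shows the three ring intersection but not the hash style index calculation.

```python
import bisect

class FloatIntTable:
    """Sorted float keys (one unit vector component) with star ids as values."""

    def __init__(self, pairs):
        ordered = sorted(pairs)                  # sort by float key
        self.keys = [k for k, _ in ordered]
        self.values = [v for _, v in ordered]

    def range_search(self, low, high):
        """Return the ids whose key lies in [low, high]."""
        # Binary search stands in for the fitted-curve index estimate.
        i = bisect.bisect_left(self.keys, low)
        j = bisect.bisect_right(self.keys, high)
        return set(self.values[i:j])

def build_sky_tables(stars):
    """stars maps star id -> (x, y, z) unit vector; returns one table per axis."""
    return [FloatIntTable((vec[axis], sid) for sid, vec in stars.items())
            for axis in range(3)]

def stars_near(tables, pointing, half_width):
    """Intersect the three one dimensional range searches about a pointing vector."""
    result = None
    for table, component in zip(tables, pointing):
        found = table.range_search(component - half_width, component + half_width)
        result = found if result is None else result & found
    return result
```

the intersection is a small box around the pointing vector rather than a circle, so a real image generator would presumably follow it with an exact angular distance check on the few stars that survive.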
from the beginning, a cultural difference between machine learning and aerospace was clear. to oversimplify, machine learning wants to be about two dimensional images, while aerospace wants to be about three dimensional unit vectors. the two representations are equivalent and interchangeable. images are more practical in many contexts, unit vectors are ideal geometrically. in the end, after roughly eight months or so, the conventions and code of a machine learning image interface grew organically over the underlying aerospace unit vector geometry.
a curious sequence of coincidences took place. a standard nineties star tracker image size was eight degrees, or twenty eight thousand arcseconds on the sky, sixteen times larger than the apparent diameter of the moon. the standard machine learning problem happens to use images that are twenty eight pixels square. adopting these pixel dimensions resulted in star images with thousand arcsecond pixels. the implications are deeper than nice rounding properties, since it means low resolution, at the level of a toy camera or blurry mobile phone photo. by comparison, real star tracker images involve arcsecond resolutions.
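the arithmetic behind the coincidence is quick to check. the variable names are just for illustration, and the half degree figure for the moon is the usual round approximation.

```python
FIELD_DEG = 8.0          # star tracker field of view, from the text
IMAGE_PIXELS = 28        # pixels per side of the machine learning image
MOON_DEG = 0.5           # approximate apparent diameter of the moon

field_arcsec = FIELD_DEG * 3600.0            # 28800 arcseconds across the field
pixel_arcsec = field_arcsec / IMAGE_PIXELS   # a little over 1000 arcseconds per pixel
moon_ratio = FIELD_DEG / MOON_DEG            # field width in moon diameters

print(field_arcsec, round(pixel_arcsec, 1), moon_ratio)
```

so the round numbers in the text are slight understatements: the field is 28,800 arcseconds across and each pixel covers about 1,029 arcseconds, roughly a third of a degree.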
low resolution makes the star identification problem more challenging and intellectually interesting. you’re forced to use global structures and patterns, rather than localized heuristics and tricks. there’s simply less information available and you have to be more thoughtful about using it, even suggesting questions about how the human brain solves the problem. for example, a typical high resolution aerospace algorithm might focus on the exact distance between a pair of stars, along with the angle to a third star. this is clearly not how the brain identifies stars, but what is the brain in fact doing?
another nice coincidence seems obvious once you’ve seen it, but isn’t so obvious at first. when you want to identify a particular star in an image, it helps to shift it to the center and make its presence implicit. there’s no point in showing it in the image, what’s significant is the geometry to the other stars. it becomes the origin of the coordinate system, and if there’s another star nearby, as often happens in a low resolution image, there’s no confusion. in practice, the effects are even nicer than you might think, a bit like getting an extra star for free while eliminating annoying coordinate transformations.
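a rough python sketch of the centering idea follows. it assumes a flat sky approximation about the target star, which is reasonable only for a small field away from the poles, and the function name and the zero or one pixel values are illustrative choices, not the project’s actual image format.

```python
import math

IMAGE_PIXELS = 28
FIELD_DEG = 8.0
PIXEL_DEG = FIELD_DEG / IMAGE_PIXELS

def centered_image(target, neighbors):
    """Build a 28 x 28 grid of 0/1 pixels with the target star as the origin.

    target and neighbors are (ra_deg, dec_deg) tuples. The target itself is
    never drawn; it is implied by the center of the grid.
    """
    image = [[0] * IMAGE_PIXELS for _ in range(IMAGE_PIXELS)]
    ra0, dec0 = target
    for ra, dec in neighbors:
        # Flat-sky offsets from the target in degrees (an approximation).
        dx = ((ra - ra0 + 180.0) % 360.0 - 180.0) * math.cos(math.radians(dec0))
        dy = dec - dec0
        col = math.floor(dx / PIXEL_DEG) + IMAGE_PIXELS // 2
        row = math.floor(dy / PIXEL_DEG) + IMAGE_PIXELS // 2
        if 0 <= row < IMAGE_PIXELS and 0 <= col < IMAGE_PIXELS:
            image[row][col] = 1
    return image
```

since the target is implicit, a close neighbor that lands on the center pixel is unambiguous: anything drawn there has to be a different star.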
all the way back to ninety, it was clear that the shapes of triangles formed by a star field are unique, and can be used to identify one or all of the stars. a few minutes’ thought always gave the feeling that this is somehow an iterative and recursive approach. once you start thinking about triangles, they tend to multiply, which always made me uncomfortable. skipping ahead to the answer, enlightenment is to state the problem simply. start with a set of star identities and iteratively set aside those that can’t be correct, until only one remains. it’s a brute force approach and hopefully there will be time later to work from deeper insights. the main thing is, it works.
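the elimination loop can be sketched in python. this is a simplified stand in that uses only pair separations, not the full triangle constraints, and the function name, the dictionary layout, and the tolerance are all assumptions for the example.

```python
def identify(candidates, observed_angles, catalog_separations, tol_deg):
    """Set aside candidate identities until the observations rule out the rest.

    candidates: iterable of possible star ids for the target star.
    observed_angles: separations (degrees) from the target to other image stars.
    catalog_separations: star id -> list of separations to its catalog neighbors.
    Returns the surviving set of ids: one id on success, several if the
    observations are ambiguous, none if nothing in the catalog fits.
    """
    remaining = set(candidates)
    for angle in observed_angles:
        # Keep a candidate only if some catalog neighbor sits at this separation.
        remaining = {
            sid for sid in remaining
            if any(abs(angle - sep) <= tol_deg
                   for sep in catalog_separations.get(sid, ()))
        }
        if len(remaining) <= 1:
            break
    return remaining
```

each observed angle is one more chance to throw candidates away, which is why low resolution hurts: a loose tolerance lets too many wrong identities survive each pass.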
at first, things started out on a more complex path, and the resulting frustrations eventually forced a rethink and simplification. the initial concept was to focus on groups of four stars instead of just three. for a triangle of three stars, adding a fourth provides significantly more information. you go from three edges to six, two of which are shared. definitely nice in theory, but the tradeoff is significantly more complexity. to sketch some of the details, think of two adjacent triangles sharing a pair of stars. the shared pair represents a new type of constraint for which stars are possible, and we could take advantage of this to more quickly set aside impossible stars. the practical effect is we have to simultaneously work with two sets of possible stars, keeping them in agreement via a set of possible shared pairs. ideally we could then set aside impossible shared pairs until only a single pair is left. but with low resolution, this is harder than it sounds. there are too many pairs that meet low resolution constraints. a low res shared side just doesn’t provide enough unique information, it’s too ambiguous. in other words, at low resolution many of the sky’s triangles are similar.
between the star and triangle levels are the pairs, the fundamental structural unit. soon after the star image generation code came methods representing all pairs separated by less than eleven degrees on the sky. this was the fourth use of the float int tables described above, with three tables for the sky axes, and a fourth table for the nearly one million pairs. pairs data is far heavier than star data, and is precomputed for repeated use. for working with triangles, speed of access to pairs is the focus. the characteristic of a pair is its angular separation, and a million pairs are represented as a million angles, along with identifiers for member stars.
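a small python sketch of a pairs table follows, with the usual caveats. the names are invented, the brute force loop over every combination of stars is only workable for a toy catalog, and a sorted list with binary search stands in for the float int table with its fitted curve.

```python
import bisect
import math
from itertools import combinations

MAX_SEP_DEG = 11.0   # pair separation limit, from the text

def angle_deg(u, v):
    """Angular separation of two unit vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))

def build_pairs_table(stars, max_sep_deg=MAX_SEP_DEG):
    """stars maps id -> unit vector; returns (angle, id_a, id_b) rows sorted by angle."""
    rows = []
    for (id_a, u), (id_b, v) in combinations(sorted(stars.items()), 2):
        sep = angle_deg(u, v)
        if sep < max_sep_deg:
            rows.append((sep, id_a, id_b))
    rows.sort()
    return rows

def pairs_in_range(table, low_deg, high_deg):
    """Return the (id_a, id_b) pairs whose separation lies in [low_deg, high_deg]."""
    i = bisect.bisect_left(table, (low_deg,))
    j = bisect.bisect_right(table, (high_deg, math.inf, math.inf))
    return [(a, b) for _, a, b in table[i:j]]
```

a full catalog build would presumably lean on the three sky tables to find each star’s neighbors rather than testing every combination, but once the table exists a lookup by angle costs only two binary searches.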