This post was originally published on the Cooper Hewitt Labs blog.
We’ve been undergoing a massive rapid-capture digitization project here at the Cooper Hewitt, which means every day brings us pictures of things that probably haven’t been seen for a very, very long time.
As an initial way to view all these new images of objects, I added “date last photographed” to our search index and allowed it to be sorted by on the search results page.
That’s when I found this.
I hope we can all agree that this pony is adorable and that if there is anything else like it in our collection, it needs to be seen right now. I started browsing around the other recently photographed objects and began to notice more animal figurines:
As serendipitous as it was that I came across this wonderful collection-within-a-collection by browsing through recently-photographed objects, what if someone is specifically looking for this group? The whole process shows off some of the work we did last summer switching our search backend over to Elasticsearch (which I recently presented at Museums and the Web). We wanted to make it easier to add new things so we could provide users (and ourselves) with as many “ways in” to the collection as possible, as it’s those entry points that allow for more emergent groupings to be uncovered. This is great for somebody who is casually spending time scrolling through pictures, but a user who wants to browse is different from a user who wants to search. Once we uncover a connected group of objects, what can we do to make it easier to find in the future?
Enter synonyms. Synonyms, as you might have guessed, are a text analysis technique we can use in our search engine to relate words together. In our case, I wanted to relate a bunch of animal names to the word “animal,” so that anyone searching for terms like “animals” or “animal figurines” would see all these great little friends. Like this bear.
The actual rule (generated with the help of Wikipedia’s list of animal names) is this:
"animal =>; aardvark, albatross, alligator, alpaca, ant, anteater, antelope, ape, armadillo, baboon, badger, barracuda, bat, bear, beaver, bee, bird, bison, boar, butterfly, camel, capybara, caribou, cassowary, cat, kitten, caterpillar, calf, bull, cheetah, chicken, rooster, chimpanzee, chinchilla, chough, clam, cobra, cockroach, cod, cormorant, coyote, puppy, crab, crocodile, crow, curlew, deer, dinosaur, dog, puppy, salmon, dolphin, donkey, dotterel, dove, dragonfly, duck, poultry, dugong, dunlin, eagle, echidna, eel, elephant, seal, elk, emu, falcon, ferret, finch, fish, flamingo, fly, fox, frog, gaur, gazelle, gerbil, panda, giraffe, gnat, goat, sheep, goose, poultry, goldfish, gorilla, blackback, goshawk, grasshopper, grouse, guanaco, fowl, poultry, guinea, pig, gull, hamster, hare, hawk, goshawk, sparrowhawk, hedgehog, heron, herring, hippopotamus, hornet, swarm, horse, foal, filly, mare, pig, human, hummingbird, hyena, ibex, ibis, jackal, jaguar, jellyfish, planula, polyp, scyphozoa, kangaroo, kingfisher, koala, dragon, kookabura, kouprey, kudu, lapwing, lark, lemur, leopard, lion, llama, lobster, locust, loris, louse, lyrebird, magpie, mallard, manatee, mandrill, mantis, marten, meerkat, mink, mongoose, monkey, moose, venison, mouse, mosquito, mule, narwhal, newt, nightingale, octopus, okapi, opossum, oryx, ostrich, otter, owl, oyster, parrot, panda, partridge, peafowl, poultry, pelican, penguin, pheasant, pigeon, bear, pony, porcupine, porpoise, quail, quelea, quetzal, rabbit, raccoon, rat, raven, deer, panda, reindeer, rhinoceros, salamander, salmon, sandpiper, sardine, scorpion, lion, sea urchin, seahorse, shark, sheep, hoggett, shrew, skunk, snail, escargot, snake, sparrow, spider, spoonbill, squid, calamari, squirrel, starling, stingray, stinkbug, stork, swallow, swan, tapir, tarsier, termite, tiger, toad, trout, poultry, turtle, vulture, wallaby, walrus, wasp, buffalo, carabeef, weasel, whale, wildcat, wolf, wolverine, wombat, woodcock, woodpecker, worm, wren, yak, zebra"
Where every word to the right of the => automatically gets added to a search for a word to the left.
Not only does our new search stack provide us with a useful way to discover emergent relationships, but it makes it easy for us to “seal them in,” allowing multiple types of user to get the most from our collections site.