In October 1991, Unicode 1.0 was released. In the 30 years since that publication, an entire generation of language workers has been educated and entered the workforce without ever having to know the “joys” of trying to ungarble text that had passed through multiple encodings. The introduction of Unicode has simplified life for many of us and allowed millions and millions of people to access digital resources in their own languages.
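As a reminder of what those “joys” looked like, here is a minimal Python sketch (an illustration of the general problem, not an example from the Unicode announcement) of a classic double-encoding accident and the narrow conditions under which it can be reversed:

```python
# A common pre-Unicode headache: UTF-8 bytes mistakenly decoded as Latin-1
# and passed along, producing mojibake such as "Ã©" in place of "é".
original = "déjà vu"

garbled = original.encode("utf-8").decode("latin-1")
print(garbled)   # -> "dÃ©jÃ  vu" (the "à" has become "Ã" plus a non-breaking space)

# If the damage happened exactly once and no bytes were lost or normalized
# away, the round trip can be reversed:
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # -> "déjà vu"
assert repaired == original
```

In practice the damage was rarely this tidy: after text had passed through several unknown or lossy encodings, recovery meant guesswork and manual cleanup, which is exactly the work a single universal encoding made unnecessary.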
Mention “interoperability” and many localizers think of yet another conference panel about the value of XLIFF, or why they should care about Translation Memory eXchange (TMX), or the arcana of ISO technical committees. The reduction of the topic to technical standards is understandable given the focus these topics have enjoyed over the past two decades since the release of TMX in 1998. However, CSA Research’s examination of the topic has revealed that interoperability is a much bigger issue w...
Localization industry veterans may recall when the OSCAR standards group in the now-defunct Localization Industry Standards Association introduced TermBase eXchange (TBX) way back in 2002, based on earlier work from 1999. Released in the early days of XML, it promised to be a major step forward for making terminological data useful. After it was adopted as an international standard (ISO 30042) in 2008, it seemed that it had reached maturity and a firm place as a star among language industry stan...
The Holy Grail of the language industry has been to standardize the transfer of jobs between the various tools and content management systems – and thus improve the outcomes. Linport, the latest initiative in this area, was born as the Container Project in 2011 at the final meeting of the Localization Industry Standards Association (LISA). Despite early promise, Linport has yet to make major inroads into the language industry. Other prospective standards, such as Translation Web Services from OA...
The history of standards for data and file exchange formats in the language industry goes back to the Localization Industry Standards Association (LISA) in the 1990s, which spearheaded the efforts around TBX, TMX, and GMX. The Organization for the Advancement of Structured Information Standards (OASIS) developed DITA, ebXML, XLIFF, and many other business data exchange standards. Linport is yet another initiative for localization data exchange. Most recently, GALA has been coordinating a ne...
In the 1980s, the American rock band Van Halen became famous for including a requirement in contracts with concert venues that they provide a bowl of M&Ms candy with all of the brown ones removed. At the time, this was widely seen as an example of how out of touch rock musicians were with reality, but it actually served a purpose. The band’s manager explained that if venues took care of the small details, he could be reasonably certain that they had also addressed more important things. However...
Intelligent or smart content has been a dream since the late 1990s. The concept refers to text, data, and audio-visual material that contains machine-interpretable information describing its structure, giving some guidance as to its meaning, and defining its relationship to other content. Various technologies have tried to deliver on the promise of content that machines can act upon. Today some approaches are beginning to bear fruit, but significant hurdles remain in the base technologies and t...
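To make that definition concrete, here is a hedged Python sketch of what machine-interpretable information attached to a piece of content might look like; the schema.org/JSON-LD markup shown is one common approach among several, and every name in the example is hypothetical rather than drawn from the article:

```python
import json

# A hypothetical illustration of "intelligent content": content metadata,
# expressed here as JSON-LD using schema.org vocabulary, that a machine can
# read to learn the item's structure, something about its meaning, and how
# it relates to other content. All names below are invented for the example.
article_metadata = {
    "@context": "https://schema.org",
    "@type": "TechArticle",                              # structural/genre information
    "headline": "Configuring the ACME-5000 Router",      # hypothetical document
    "inLanguage": "en",
    "about": {"@type": "Product", "name": "ACME-5000"},  # a hint at meaning
    "isPartOf": {
        "@type": "CreativeWorkSeries",
        "name": "ACME Installation Guides",              # relationship to other content
    },
}

# Any tool that understands schema.org (a search engine, a CMS, a translation
# pipeline) can act on these statements without parsing the prose itself.
print(json.dumps(article_metadata, indent=2))
```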
Last month the World Wide Web Consortium (W3C) announced its new Internationalization Initiative as a way to boost its long-running activity in this area. CSA Research spoke with Richard Ishida, who leads these efforts, to learn more about its plans and what they mean for the language industry. He described an ambitious effort to identify – and resolve – technological barriers that keep the web from living up to the “world wide” part of its name. However, the success of this effort will rely on ...
User-generated content (UGC) has garnered a lot of attention due to the challenges it poses for localization, such as an abundance of spelling errors, the extent to which its meaning depends on context, a lack of consistency, and time sensitivity. But even as enterprises and language service providers (LSPs) struggle to deal with it, another type of generated content has been quietly swelling into a looming tsunami: machine-generated content (MGC). Today, increasing quantities of content appear ...