Intel’s Atom C2000 processor family has a fault that effectively bricks devices, costing the company a significant amount of money to correct. But the semiconductor giant won’t disclose precisely how many chips are affected nor which products are at risk.
On its Q4 2016 earnings call earlier this month, chief financial officer Robert Swan said a product issue limited profitability during the quarter, forcing the biz to set aside a pot of cash to deal with the problem.
“We were observing a product quality issue in the fourth quarter with slightly higher expected failure rates under certain use and time constraints, and we established a reserve to deal with that,” he said. “We think we have it relatively well-bounded with a minor design fix that we’re working with our clients to resolve.”
Coincidentally, Cisco last week issued an advisory warning that several of its routing, optical networking, security and switch products sold prior to November 16, 2016 contain a faulty clock component that is likely to fail at an accelerated rate after 18 months of operation.
Cisco at the time declined to name the supplier of that component. When asked on Monday whether Intel supplied the faulty electronics, a Cisco spokesperson told The Register that the networking giant does not intend to publicly name the supplier.
Intel indicated in a January 2017 revision of its Atom C2000 family documentation that the chip line contains a clock flaw. Errata note AVR.54, titled “System May Experience Inability to Boot or May Cease Operation,” explains that the Atom C2000 Low Pin Count bus clock outputs (LPC_CLKOUT0 and LPC_CLKOUT1) may stop functioning. Permanently.
An Intel spokesperson in an email to The Register characterized the issue as “a degradation of a circuit element under high use conditions at a rate higher than Intel’s quality goals after multiple years of service.”
“If the LPC clock(s) stop functioning the system will no longer be able to boot,” Intel’s documentation explains.
This consequence is precisely what Cisco says may happen to its devices given enough time. “Once the component has failed, the system will stop functioning, will not boot, and is not recoverable,” Cisco’s advisory states.
The Register asked Intel whether it could confirm that Cisco’s advisory could be attributed to an Intel component. Intel said it could not confirm or deny whether its chip issue is the one affecting Cisco gear, citing a policy of not commenting on customers. We note that at least the affected Cisco ASA 55xx products use Intel’s Atom C2000 system-on-chips.
We asked Intel to provide specific details about when it began and stopped shipping Intel Atom C2000 processors with faulty clock outputs. Intel declined to comment. The official errata says the B0 stepping of C2xxx Atoms is vulnerable to failure, and these parts began shipping in 2013. The specific SKUs are:
C2308, C2338, C2350, C2358, C2508, C2518, C2530, C2538, C2550, C2558, C2718, C2730, C2738, C2750, and C2758.
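For readers wondering whether a box on their rack carries one of these parts, here is a minimal Python sketch that looks for a listed SKU in a CPU model string. The helper name `find_listed_sku` and the sample model strings are our own illustration, not anything from Intel or Cisco, and a match only shows that the SKU is on the list above; it says nothing about the stepping or about whether a board-level workaround has been applied.

```python
import re

# SKUs named in the list above (B0 stepping, per Intel's errata).
AFFECTED_SKUS = {
    "C2308", "C2338", "C2350", "C2358", "C2508", "C2518", "C2530", "C2538",
    "C2550", "C2558", "C2718", "C2730", "C2738", "C2750", "C2758",
}


def find_listed_sku(model_name):
    """Return the listed SKU found in a CPU model string, or None.

    Hypothetical helper: it only matches the SKU name and cannot tell
    which stepping the chip is.
    """
    # Pull out standalone tokens of the form C2 followed by three digits.
    for token in re.findall(r"\bC2\d{3}\b", model_name):
        if token in AFFECTED_SKUS:
            return token
    return None
```

On a Linux system, the string to feed in would typically come from the "model name" line of /proc/cpuinfo; the exact wording of that line is an assumption and may vary between systems.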
We asked Intel how many affected Atom C2000 chips have been shipped and how much fixing the issue will cost the company. Intel declined to comment.
Intel did, however, provide some insight on how the Atom C2000 flaw might be addressed. “A board level workaround exists for the existing production stepping of the product which resolves the issue,” a company spokesperson said in an email. “Additionally, Intel will implement and validate a minor silicon fix in a new product stepping that resolves this issue.”
Many other technology vendors make products with Intel Atom C2000 processors, including Dell and Synology. The Register pinged Dell via email, but it was not immediately available for comment.
People with Synology DS1815+ storage boxes have been reporting complete hardware failures; the DS1815+ is powered by an Intel Atom C2538.
Other vendors using Atom C2000 chips include Asrock, Aaeon, HP, Infortrend, Lanner, NEC, Newisys, Netgate, Netgear, Quanta, Supermicro, and ZNYX Networks. The chipset is aimed at networking devices, storage systems, and microserver workloads. If you know of any affected or failed gear, please let us know.
According to this Intel data-sheet [PDF], LPC_CLKOUT0 and LPC_CLKOUT1 are driven by the processor to provide essential timing signals to hardware on the board, including the boot ROM. If these signals stop ticking, the rest of the electronics stops, too. ®