Apple researchers: LLMs don’t do formal reasoning

A new paper from several AI researchers at Apple has been making the rounds. It concludes that large language models (LLMs) don’t do formal reasoning and can be easily distracted by minor, irrelevant information. It’s a pretty dense 20+ page paper, but Gary Marcus has an excellent, in-depth summary on his site.

Marcus’ key takeaway from Apple’s AI research:

“We found no evidence of formal reasoning in language models… Their behavior is better explained by sophisticated pattern matching—so fragile, in fact, that changing names can alter results by ~10%!”

Based on both his expertise and Apple’s findings, Marcus makes a definitive statement about LLMs’ reasoning capabilities:

There is just no way can you build reliable agents on this foundation, where changing a word or two in irrelevant ways or adding a few bit of irrelevant info can give you a different answer.

These findings align with my experience. While getting high-quality, accurate responses from LLMs is possible, it often requires careful prompting and iteration. LLMs are excellent tools that I use every day, and I’m actively helping build AI products at the day job, but like most tools they have their limitations. What’s particularly noteworthy to me is that these same limitations were documented back in 2019, and while LLMs have made remarkable progress in many areas, their fundamental reasoning capabilities haven’t improved at nearly the same pace.
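The name-swapping fragility described above is easy to probe for yourself. Here is a minimal sketch of a consistency check: it asks the same question with different names and reports how often the answers agree. The word problem, the `ask_model` callable, and the `fake_model` stand-in are all made up for illustration; none of them come from the Apple paper, and you would swap in a call to whatever model you actually use.

```python
from collections import Counter
from typing import Callable, Iterable

# Hypothetical word problem; only the name changes between variants.
TEMPLATE = (
    "{name} buys 3 notebooks at 4 dollars each. "
    "How many dollars does {name} spend?"
)


def consistency(ask_model: Callable[[str], str], names: Iterable[str]) -> float:
    """Return the share of name variants that give the most common answer."""
    answers = [ask_model(TEMPLATE.format(name=name)).strip() for name in names]
    if not answers:
        raise ValueError("need at least one name")
    # Count of the single most frequent answer across all variants.
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / len(answers)


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call, deliberately thrown off by one name."""
    return "13" if prompt.startswith("Sophie") else "12"


if __name__ == "__main__":
    score = consistency(fake_model, ["Liam", "Sophie", "Ava", "Noah"])
    print(f"Answers agree for {score:.0%} of name variants")
```

A score below 100% on a question where the name is irrelevant is the kind of result worth flagging for human review.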

So what does all this mean? Does it mean AI tools are dead? Not at all. I am a big proponent of human-in-the-loop AI solutions that leverage the strengths of AI and iteratively improve with human review and intervention. With human oversight, model monitoring, and great AI product designers (of course), we can build powerful AI tools that help us do the work we all need to get done every day, even if those tools can’t do formal reasoning.

How This Texas Town Became One of America’s Fastest-Growing Cities

Schlitterbahn

New Braunfels grew and changed a ton while I grew up there until I left for college at 18. The growth and the change have only accelerated in the 20 years since.

The New York Times did a profile on why New Braunfels is one of America’s fastest-growing cities. It talks, of course, about its German roots, Schlitterbahn, and Gruene Hall, but also about how being situated directly between Austin and San Antonio has resulted in explosive growth and change.

(In a sign that the small-town roots are still there, there are also several quotes from my middle school principal and from a friend’s dad, who is now the mayor.)

Google SSL Search Will Block Search Referrers

Your website stats are getting a little less valuable thanks to a new search feature Google is rolling out to its users. SSL-encrypted search will now automatically be turned on for all logged-in users, resulting in improved security and privacy for web searches.

This improved privacy will result in less available data for site owners in analytics tools, including Google Analytics. Specifically, for users with SSL search enabled, Google will no longer pass along the search keywords that brought them to your website. From Google’s blog post announcing the change:

What does this mean for sites that receive clicks from Google search results? When you search from https://www.google.com, websites you visit from our organic search listings will still know that you came from Google, but won’t receive information about each individual query. They can also receive an aggregated list of the top 1,000 search queries that drove traffic to their site for each of the past 30 days through Google Webmaster Tools.
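If you process referrers yourself, your code needs to handle visits that clearly came from Google but carry no keywords. The sketch below assumes the query travels in a `q` parameter on the referrer URL and that it is simply missing or empty for SSL searches; the example URLs and the `search_keywords` helper are illustrative, not taken from Google’s announcement.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def is_google(referrer: str) -> bool:
    """True if the referrer's host is google.com or one of its subdomains."""
    host = (urlparse(referrer).hostname or "").lower()
    return host == "google.com" or host.endswith(".google.com")


def search_keywords(referrer: str) -> Optional[str]:
    """Return the search query from a Google referrer, or None if unavailable."""
    if not is_google(referrer):
        return None
    # parse_qs drops blank values by default, so "q=" yields no entry.
    values = parse_qs(urlparse(referrer).query).get("q", [])
    return values[0] if values else None


if __name__ == "__main__":
    for ref in (
        "http://www.google.com/search?q=new+braunfels+tubing",
        "https://www.google.com/",
    ):
        keywords = search_keywords(ref)
        label = keywords if keywords is not None else "(not provided)"
        print(f"google={is_google(ref)} keywords={label}")
```

Counting the "(not provided)" visits separately keeps them attributed to Google search rather than lumping them in with direct or unknown traffic.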


Think Apps are All That Matters for Mobile? Think Again.

The buzz out there now is all about apps. Whether it’s for iPhone or Android, you can’t visit Yahoo.com or watch the local news without hearing all about the latest, greatest mobile app. Dedicated apps that users install from app stores have a lot of advantages and there are some out there we all can’t live without, but you can’t ignore the mobile web.

Two apps that are on most every smartphone user’s home screen are Twitter and Facebook. They’re two of the best-designed apps out there, and both have been in the top 25 free apps in the iTunes App Store consistently since launch. Knowing that, you might think Facebook and Twitter users always take advantage of those great apps when they’re on the go, but the numbers say otherwise. Based on numbers aggregated by Luke Wroblewski, mobile usage of the top two social juggernauts continues to explode, and much of it is on the mobile versions of their websites:

  • 50% of the more than 500 million active Facebook users currently access Facebook through their mobile devices (250M) compared to 25% a year ago (100M out of 400M). 33% of Facebook posts are sent via mobile devices. (source)
  • Facebook’s top mobile client is m.facebook.com (Facebook’s mobile Web site) with 18% of total new Facebook posts. Android, iPhone, and Blackberry are next each with about 4% of total new Facebook posts. (source)
  • 50% of total active Twitter users are on multiple platforms (mobile) compared to 25% a year ago. 40% of all tweets are sent via mobile. (source)
  • Twitter’s top mobile client is m.twitter.com (Twitter’s mobile Web site) with 14% of total unique users. SMS is next with 8% of total unique users. Then Twitter for iPhone (8%) followed by Twitter for Blackberry (7%). (source)

For both sites, a much higher percentage of people are using the mobile website than any particular mobile app. On Twitter, even SMS (8%) is level with the iPhone app (8%) and ahead of the Blackberry app (7%). Whether it’s because they’re on older phones or because they’re following links in an email or a search result (remember that links don’t open apps), it’s clear that just because you offer an app doesn’t mean your mobile web presence is any less important.

Don’t Obfuscate URLs on Your Website

Link shorteners have been around for years, but their use has exploded due to Twitter and its 140-character limit. Unfortunately, more and more content editors and bloggers have begun using them on their websites as well.

Sure, nobody likes a 200-character-long, unintelligible URL. But your website visitors like being surprised by where they end up after clicking a link even less.

Providing your site visitors with information on the domain and page a link will take them to is a critical user experience best practice. The more they know about what they’re about to click on, the more likely they are to actually click the link. Which of the two links to this blog’s Twitter archives below are you more likely to click on?

Full length: https://www.brianbehrend.com/tag/twitter/
Shortened: http://bit.ly/ePBm1U

The full path of a link informs the visitor about the destination domain and page name, along with some basic info about where the page is within the site and what type of file they’re about to load. The only thing worse than using shortened links on your website would be using a shortened link to surprise a visitor with a 65-megabyte PDF that crashes their browser.
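To make that concrete, here is a rough sketch of what a full URL gives away before anyone clicks it, and how a shortened one hides it. The `describe_link` helper and the short list of shortener domains are my own illustration and nowhere near exhaustive.

```python
import posixpath
from urllib.parse import urlparse

# Illustrative sample of link-shortener hosts, not a complete list.
SHORTENER_HOSTS = {"bit.ly", "tinyurl.com", "goo.gl", "t.co", "ow.ly"}


def describe_link(url: str) -> dict:
    """Summarize what a visitor can learn from a URL without clicking it."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    segments = [part for part in parsed.path.split("/") if part]
    extension = posixpath.splitext(parsed.path)[1].lstrip(".").lower()
    return {
        "domain": host,
        "section": segments[:-1],  # where the page sits within the site
        "page": segments[-1] if segments else "",
        "file_type": extension or None,  # e.g. "pdf"; None for ordinary pages
        "shortened": host in SHORTENER_HOSTS,
    }


if __name__ == "__main__":
    for link in (
        "https://www.brianbehrend.com/tag/twitter/",
        "http://bit.ly/ePBm1U",
    ):
        print(describe_link(link))
```

The full-length link reports a domain, a section, and a page name; the shortened one reports only the shortener’s domain and an opaque code. A check like this could run over a site’s content to flag shortened links before they get published.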

Tiny URLs have their place, but it’s not on your website.

Five Podcasts for People Who Work on the Web

The web changes every day.

The best marketing ideas or the coolest programming techniques in January can be outdated by November. My desk is littered with fantastic books on web development and design, and my Delicious and Instapaper accounts are full of great articles on content strategy and user experience, but one of the best ways to learn about the latest strategies on the web is by listening to podcasts.

In less than an hour while on your way to work you can keep up to date on the latest web trends. If you’re not already tuned in, read on to learn about five great podcasts that you should be listening to.
