m-i-l
Joined 3,941 karma
Personal website: https://michael-lewis.com/

Side project: https://searchmysite.net/


  1. I think this is missing the point - it is a bit like saying "you only ever notice bad fraud, if the fraud is well done you never notice it" - the point is what it is, not whether you notice it or not. With AI in films at the moment there are still people behind, and reviewing, the AI output, so it is just another creative tool, which is fine. However, if someone were to generate an entire 90 minute film and put it online without even having the decency to spend 90 minutes of their own time watching it themselves first, that would not be fine. But that is happening with AI slop on the internet now. Whether it is any good or not is not the point - the point is that it is disrespectful of people's time and attention.
  2. Thanks for the great feedback:-) This is what searchmysite.net is attempting to do - help make "surfing the web" a fun leisure activity once more. It is good to see more people seem to get that point now. When it was on HN nearly 3 years ago[0], many people saw a search box and thought it must be a Google replacement, but were disappointed to find it wasn't. And I guess now more than ever it is useful to have a way of finding content on the web which has been made by humans rather than AI.

    [0] https://www.hackerneue.com/item?id=31395231

  3. At a big corporate, we had an Apache Solr based search with some reasonably clever lemmatization, stats analysis and spell check config, which suggested alternative searches if not many results were found for the original query. One day someone reported an unfortunate edge case which caused a bit of a panic: if you searched "annual report" it returned "did you mean anal report?" (we were in the finance sector rather than the medical sector, but there were a lot more documents in the corpus containing words like analysts, analysis, analytics etc). Anyway, the point is yes, it is great to have that sort of functionality, but it does come at a cost, and a small project like this might prefer to keep it simple.
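
    To illustrate how that kind of edge case can arise, here is a minimal sketch of a frequency-ranked "did you mean" suggester. It is not Solr's actual spell check implementation: the function names, the thresholds and the toy document frequencies are all made up for illustration, and it assumes the index contains a short, very common term that sits within two edits of a rarer query term.

```python
from typing import Dict, Optional


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def suggest(term: str, doc_freq: Dict[str, int],
            max_edits: int = 2, min_hits: int = 5) -> Optional[str]:
    """Suggest an alternative only when the original term has few hits.

    Candidates are indexed terms within max_edits of the query term,
    ranked purely by document frequency (hypothetical policy).
    """
    if doc_freq.get(term, 0) >= min_hits:
        return None
    candidates = [t for t in doc_freq
                  if t != term and edit_distance(term, t) <= max_edits]
    if not candidates:
        return None
    return max(candidates, key=lambda t: doc_freq[t])


# Toy index statistics, invented for this example.
toy_doc_freq = {"annual": 2, "anal": 900, "manual": 40, "report": 300}
print(suggest("annual", toy_doc_freq))
```

    Ranking purely by frequency is what causes the problem here: the more common term wins regardless of whether it makes sense in context, which is part of the cost of this sort of feature.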
  4. That's right. Most search engines are funded by advertising, where there is a clear conflict of interest[0], not to mention the incentive for spam etc. Alternative models include a subscription fee (which I don't think would work for a small niche search like this) and donations (which may or may not be sustainable). Looking through some of the support forums for the big search engines, I'm pretty sure that enough site owners would pay a fee for support to pay the running costs for a large search engine, although for a smaller search engine like this there needs to be something more than just support, hence the search as a service features.

    [0] "Advertising funded search engines will be inherently biased towards the advertisers and away from the needs of consumers", to quote Sergey Brin and Lawrence Page in their "The Anatomy of a Large-Scale Hypertextual Web Search Engine" paper from 1998.

  5. The LLM was for an experiment in retrieval augmented generation, i.e. "a chat with your website" style interface, using Apache Solr as the vector store. Results (on a small self-hosted LLM to keep costs manageable) weren't good enough for the functionality to be fully rolled out, so the LLM has been disabled and is likely to be fully removed.
  6. Postgres is just used for the site admin, i.e. keeping track of submissions, review status, subscriptions etc. The actual search index is in Apache Solr. In theory you could use Solr to store all the admin data, but it is generally not recommended to use a Solr style document store to master data. I guess something more lightweight like SQLite could be used, but it is intended to be deployed on servers and Postgres isn't too resource intensive.
  7. A couple of references to the Nazis, but no reference to the Nazi book burnings, an incredibly symbolic physical manifestation of knowledge and information destruction, which I'd have thought would be very relevant in this context, i.e. in the praise of physical books? Perhaps it wasn't mentioned because it doesn't quite fit in with the narrative of digital being all bad, given digital knowledge can be more resistant to suppression and physical destruction.

    Also some great quotes from 30 years ago, e.g. Carl Sagan's "when awesome technological powers are in the hands of the very few" the nation would "slide, almost without noticing, back into superstition and darkness". But did it actually have to end up this way? And is it still possible (with enough collective will power) to push Big Tech profiteering back enough to deliver some of the society enhancing changes originally envisioned in the mid-1990s? Just as it took decades for the full positive implications of the invention of the printing press to come to fruition, perhaps we still need more time before we decry the internet as a net negative?

  8. My children were given a soft toy a few years back from a relative who had bought it from a Chinese street market while on holiday in China. When it was switched on it jumped about frantically and sang a very loud and shrill song. Not 100% sure which language it was, but it is entirely possible it was some form of Chinese street music, and certainly fits the article's description of "Mainland Chinese recordings" as "shouty, harsh and ear-piercing". Normally my children love things that adults find annoying, but even they were afraid of this one.
  9. Those are the good bots, which say who they are, probably respect robots.txt, and appear on various known bot lists. They are easy to deal with if you really want. But in my experience it is the bad bots you're more likely to want to deal with, and those can be very difficult, e.g. pretending to be browsers, coming from residential IP proxy farms, mutating their fingerprint too fast to appear on any known bot lists, etc.
  10. I used to work in an office which briefly had a commercially available 4kW microwave in the coffee area. I used to like it because it was fast. Unfortunately several other people failed to appreciate that you had to take the 800W timings and divide by 5, and it was quickly removed after several people set fire to their food.
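
    The divide by 5 rule can be sketched as below. This assumes the food needs roughly the same total energy either way, so the time scales inversely with power, which is a simplification; the function name is made up for illustration.

```python
def scaled_cook_time(label_seconds: float,
                     label_watts: float = 800.0,
                     oven_watts: float = 4000.0) -> float:
    """Scale a packet timing to a different oven power.

    Assumes constant total energy (power x time), a rough approximation.
    """
    return label_seconds * label_watts / oven_watts


# A 5 minute timing written for an 800W oven, run in a 4kW oven.
print(scaled_cook_time(300))
```

    So anything labelled as 5 minutes at 800W needs about 1 minute, and running it for the full 5 minutes delivers roughly 5 times the intended energy.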
  11. This matches my experience. I ran one of my side-projects on AWS for a couple of years before switching to Hetzner - AWS was around £35 a month while Hetzner was around £7 a month, so Hetzner was around 80% cheaper for an equivalent service[0]. The other big thing was all the little costs in AWS - it took 2 months to get the AWS bill down to £0 due to all the hidden extras like backups and Elastic IP address.

    [0] Full details at https://blog.searchmysite.net/posts/migrating-off-aws-has-re...
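
    For anyone checking the 80% figure, it is simply the relative saving on the monthly bill. A quick sketch using the approximate amounts above (the function name is just for illustration):

```python
def percent_saving(old_cost: float, new_cost: float) -> float:
    """Relative saving of new_cost versus old_cost, as a percentage."""
    return (old_cost - new_cost) * 100 / old_cost


# Approximate monthly costs in GBP from the comment above.
print(percent_saving(35, 7))
```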

  12. > Who here thinks twitter is a platform for rational discourse?

    That's the elephant in the room here. The site formerly known as Twitter is optimised to maximise engagement, and conflict typically generates much more engagement than co-operation. It'd be like trying to have a friendly discussion to work out your differences with your opponent in a boxing ring, surrounded by a large crowd who have been whipped up by the venue into baying for a fight. I sometimes wonder if it is even possible to build a sustainable internet platform which somehow rewards cordial good faith discourse and penalises the mean and intolerant (and by sustainable I mean immune to the tendency for these platforms to eventually pivot to maximising profits above all else).

  13. > "Postman even mentioned the fact in his 1995 book "The end of education""

    Quote from Postman according to Wikipedia[0]:

    "the level of sensibility required to appreciate the music of Roger Waters is both different and lower than what is required to appreciate, let us say, a Chopin étude."

    Ouch.

    I actually got the album when it came out, and was roughly aware of the concept and the book from reviews in the music press. Had I known that it was comparing Orwell and Huxley I'd have definitely made the effort to read more. But this was before the internet so it wasn't easy (you had to do things like going to a public library), so technological progress is not all downside.

    [0] https://en.wikipedia.org/wiki/Amusing_Ourselves_to_Death

  14. If some people are offended by phone use, and no people are offended by non-phone use, then I'd have thought the default (assuming you don't want to check or to offend) would simply be non-phone use.
  15. I'm originally from Shetland, but moved to the Scottish Borders when I was young, and do remember people had a very hard time understanding my Shetland dialect (which I had to lose pretty quickly to communicate locally).

    And to the original point, neither Shetland nor the Scottish Borders have ever had any Gaelic influence at any point in their history, and recent attempts to claim otherwise tend not to go down too well with the locals.

  16. > Scotland and Ireland are intimately linked by a common language (Gaelic)

    Just on parts of the west coast of Scotland - much of mainland Scotland spoke the Scots language, with the (now dead) Norn language spoken in the northern areas (with Norse heritage), and other languages in the border areas. The promotion of Scottish Gaelic as a "national language" is very much modern-day myth-building.

  17. There's another use case towards the end: "Transparent displays could have a place on the desktop—not so you can see through them, but so that a camera can sit behind the display, capturing your image while you’re looking directly at the screen. This would help you maintain eye contact during a Zoom call."

    Personally I think that would be a great idea. But from my (non expert) perspective, for eye contact in video calls, I'd have thought a single camera behind a transparent 30" monitor would be little better than a single camera on top of a 30" monitor, given that you may have multiple faces on screen and have those faces in different places on screen. Maybe a matrix of cameras behind a transparent screen, with gaze detection and eye tracking to determine which to activate at any time?

  18. > "50k windows-based endpoints or so. All down."

    I'm a dev rather than infra guy, but I'm pretty sure everywhere I've worked which has a large server estate has always done rolling patch updates, i.e. over multiple days (if critical) or multiple weekends (if routine), not blast every single machine everywhere all at once.
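
    A minimal sketch of what I mean by a rolling update: patch a small canary wave first, then progressively larger waves, and stop as soon as a wave reports failures. The wave sizes, the `apply_patch` callback and the failure threshold are all hypothetical, and a real rollout would also wait (days or weekends, as above) and run health checks between waves.

```python
from typing import Callable, Iterator, List


def waves(hosts: List[str], first_wave: int = 1,
          growth: int = 10) -> Iterator[List[str]]:
    """Yield progressively larger batches, starting with a small canary."""
    start, size = 0, first_wave
    while start < len(hosts):
        yield hosts[start:start + size]
        start += size
        size *= growth


def rolling_update(hosts: List[str],
                   apply_patch: Callable[[str], bool],
                   max_failures: int = 0) -> List[str]:
    """Patch wave by wave; halt before the next wave if one has too many failures."""
    patched: List[str] = []
    for wave in waves(hosts):
        failures = 0
        for host in wave:
            if apply_patch(host):  # hypothetical callback, True on success
                patched.append(host)
            else:
                failures += 1
        if failures > max_failures:
            break  # the remaining hosts are left untouched
    return patched


# Example: 50 hosts, where one host in the second wave fails to patch.
estate = [f"host{i}" for i in range(50)]
done = rolling_update(estate, lambda h: h != "host3")
print(len(done), "patched;", len(estate) - len(done), "not patched")
```

    With this shape, a bad patch takes out a handful of machines in an early wave rather than the whole estate.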

  19. > "We’ll soon have near-AGI intelligences (GPT-5)"

    Does anyone technical believe GPT-5 will be even remotely close to anything which has even a vague resemblance to AGI?

  20. Surprised to see people talking about spudguns firing whole potatoes! I didn't even know that was a thing. The spudgun I had when I was young just fired small parts of a potato, with a whole potato enough to last for ages. I think it was the Lone Star Spudmatic. Not only safer but less wasteful.

    FWIW, it looks like Wikipedia differentiates the smaller spudgun from the larger potato cannon: https://en.wikipedia.org/wiki/Spud_gun vs https://en.wikipedia.org/wiki/Potato_cannon

  21. > "Price. By far."

    In the UK, the Atari ST was originally £299 while the Amiga was £499, so it's not really fair to compare the two. I was able to save up for an ST with various weekend and school holiday jobs, but there's no way I could have saved up for an Amiga (and the only classmates I knew with Amigas had their parents buy it for them).

  22. Throughout the video I was wondering what possible practical applications there could be. I got it at the end: "we use this effect to engage people who are otherwise not so interested in science".
  23. In 1991 or 1992 I used POV-Ray on my Atari ST to create some title screens for some home videos. Completely gratuitous marble text in front of a glass ball on top of water type of stuff, which took all night to render, but it was fun, and crucially free. For years I'd looked enviously at Cyber Studio for the Atari ST, with its StereoTek liquid crystal shutter 3D glasses add-on, but it was just too expensive for me at the time.

    Then in 1996 or 1997 I thought it would be fun to use it in a professional context at the software company I worked at, making a 3D animated GIF version of one of the product logos which I put on the web site (FWIW it looks like the 3D non-animated version is still visible on the Internet Archive Wayback Machine at https://web.archive.org/web/19971211003918/http://www.sophos... 27 years later). Although no-one had asked for it, I was still in effect getting paid to do something I used to do for fun, which felt good.

  24. Location: London, UK

    Remote: Yes, but pref hybrid or onsite

    Willing to relocate: No

    Technologies: Languages: Python (esp Flask), JavaScript (vanilla, some React and Vue), Java. AI/ML: PyTorch, LangChain, TensorFlow. Platforms: Linux, Docker, OpenShift (Kubernetes), AWS. Data: PostgreSQL, SQL Server, Hadoop. Other: Apache Solr, Adobe Experience Manager.

    Résumé/CV: https://www.linkedin.com/in/michaelianlewis/

    Email: michael at michael-lewis dot com

    About me: I'm a seasoned tech leader who enjoys building web and mobile apps and aiming for satisfied customers. I have both the tech skills to help define and create the most appropriate solution, and the leadership skills to maximise the chance of successful delivery. Tech experience includes full stack dev, architecture and AI/ML. Leadership experience includes leading departments, manager of managers, and matrix managing geographically dispersed cross-functional teams. Worked in organisations ranging from a software startup to a large multinational enterprise. Might also be known to some as the developer of a boutique internet search engine.

  25. Also the Abbey Road crossing webcam at https://www.abbeyroad.com/crossing . Not so busy at night, but plenty of people trying to take their photos during the day.
  26. As much as I like "digital gardens" and think that they are a step in the right direction, rewilding requires much more than cultivating lots of isolated patches of ground - the article even mentions a nature reserve which was "too small and too disconnected to be rewilded. Its effectively landlocked status made over-grazing and collapse inevitable".
  27. > "I have some SACDs but can't actually play them, even though my dac can do the audio. Is there some way to put the tracks on a computer?"

    As per the other comments, it is technically possible, but not exactly straightforward: if you have access to a compatible Blu-ray player, you can create a special bootable USB stick, boot the player from that, connect to the player over the network, and use an SACD client to extract the ISO and/or DSF files containing the DSD data. You probably won't be able to play these directly on your media server, but you can use ffmpeg to convert to e.g. 24-bit/88.2kHz FLAC files. If you're interested, I wrote up the process I followed (with links to original sources) at https://www.michael-lewis.com/posts/extracting-multichannel-... .
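
    As a rough sketch of the final conversion step only, the snippet below builds an ffmpeg command line for one DSF file. The file name and helper function are made up for illustration, and the exact options are an assumption rather than the ones from my write-up: `-sample_fmt s32` is commonly used to get 24-bit output from ffmpeg's FLAC encoder, but check the result against your own ffmpeg build.

```python
from pathlib import Path
from typing import List


def dsf_to_flac_command(src: Path, sample_rate: int = 88200) -> List[str]:
    """Build (but do not run) an ffmpeg command converting a DSF file to FLAC.

    Assumption: s32 samples yield 24-bit FLAC with the default encoder settings.
    """
    dst = src.with_suffix(".flac")
    return [
        "ffmpeg",
        "-i", str(src),           # input DSF file containing the DSD data
        "-c:a", "flac",           # encode audio as FLAC
        "-sample_fmt", "s32",     # assumed to produce 24-bit output
        "-ar", str(sample_rate),  # resample to 88.2kHz
        str(dst),
    ]


# Hypothetical file name; pass the list to subprocess.run() to execute it.
cmd = dsf_to_flac_command(Path("track01.dsf"))
print(" ".join(cmd))
```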

