siim
699 karma
github.com/Siim x.com/siimh

  1. Speaking > typing for creation.

    Reading > listening for consumption.

    Talk to create, read to consume.

  2. Curious what made you think the backend uses LLMs for content generation?

    To clarify:

    1. transcription is local VOSK speech-to-text via WebSocket

    2. live transcript post-processing has an optional Gemini Flash-Lite pass that tries to fix obvious transcription mistakes, nothing else. The real fix here is a more accurate transcriber.

    3. backend: TypeGraphQL + MongoDB + Redis

    The anti-AI stance isn't "zero AI anywhere", it's about requiring human input.

    AI-generated audio is either too bad or too perfect. Real recorded voice has human imperfections.
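    To make the pipeline concrete, here is a minimal sketch of the client-side handling of VOSK WebSocket messages. vosk-server emits `{"partial": "..."}` while speech is in progress and `{"text": "..."}` when an utterance is finalized; the function name and the simulated stream below are illustrative, not VoxConvo's actual code.

    ```python
    import json

    def handle_vosk_message(raw: str, finals: list[str]) -> str:
        """Parse one JSON message from a VOSK WebSocket server.

        Final utterances ({"text": ...}) are appended to `finals`;
        the return value is the current live (partial) line.
        """
        msg = json.loads(raw)
        if "text" in msg:            # finalized utterance
            if msg["text"]:
                finals.append(msg["text"])
            return ""                # live line resets after a final
        return msg.get("partial", "")

    # Simulated message stream from the server
    finals: list[str] = []
    live = ""
    for raw in ['{"partial": "hello wor"}',
                '{"text": "hello world"}',
                '{"partial": "how are"}']:
        live = handle_vosk_message(raw, finals)

    print(finals)  # ['hello world']
    print(live)    # how are
    ```

    An optional LLM cleanup pass would then operate only on the finalized strings in `finals`.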

  3. That's a good call. While there's no general public feed, individual profiles are public. For example, here's mine: https://voxconvo.com/siim
  4. True. However, voice input has higher friction than typing "chatgpt, write me a reply".
  5. I'm working on https://X11.Social, a voice-first content creation tool for X.

    The initial idea was "call to tweet", the ability to compose posts on the go by having a natural conversation with an AI assistant over a simple phone call. This is useful for turning thoughts from a walk or drive into a polished "brain dump" post, or for engaging with user lists without being at a computer.

    It has since evolved into a broader system:

    Chrome Extension: A context-aware assistant that lives in the browser. It has a Quake-style console (activated by opt+space) for quick chat and can analyze the content of any page you're on (e.g., YouTube transcripts, articles, other tweets) to help you create relevant content.

    Engagement Predictor: A feature that scores tweet drafts in real-time to predict their potential for engagement. It's built on a model trained on my own dataset pulled from the X API plus another public dataset from Kaggle [0].
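    As a rough illustration of what such a predictor's input might look like, here is a toy feature extractor for a tweet draft. The specific features (length, hashtag/mention/link counts, question mark) are my guesses at typical engagement signals, not X11's actual feature set; a trained XGBoost model would consume a vector like this.

    ```python
    import re

    def tweet_features(text: str) -> dict[str, float]:
        """Hand-crafted features for a toy tweet-engagement model.
        Feature choices here are illustrative assumptions."""
        return {
            "chars": float(len(text)),
            "words": float(len(text.split())),
            "hashtags": float(len(re.findall(r"#\w+", text))),
            "mentions": float(len(re.findall(r"@\w+", text))),
            "links": float(len(re.findall(r"https?://\S+", text))),
            "is_question": 1.0 if "?" in text else 0.0,
        }

    feats = tweet_features("Day 87 of #buildinpublic - thoughts? https://x11.social")
    print(feats["hashtags"], feats["is_question"])  # 1.0 1.0
    ```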

    Scheduled AI Calls: The system can call you on a predefined schedule to proactively brainstorm content ideas.

    Here is the tech stack:

    - Frontend: React, Tailwind, shadcn/ui

    - Auth: X OAuth

    - Payments: Stripe Subscriptions

    - Voice AI: ElevenLabs Conversational AI, Twilio

    - Engagement Predictor ML: Python, scikit-learn, XGBoost on a data pipeline from X API v2 and a base dataset from Kaggle.

    - Chrome Extension: Same stack as the frontend, plus the Chrome Extensions API

    - Blog: Jekyll

    - Infrastructure: Deployed on AWS Fargate using AWS Copilot for orchestration (ECS).

    I'm building solo and just got the first trial user after 87 days of building in public. It's a long road but the feedback so far is encouraging.

    [0] https://www.kaggle.com/code/shpatrickguo/tweet-virality-pred...

  6. Fair. X11 is ElevenLabs-inspired voice tech, X for the platform + 11 for AI voice.

    I kept the name for the call-to-tweet vision. Thoughts on the demo?

  7. After looking into the code I found out that this app is built using Voronoi diagrams. [1]

    The actual positions are saved in a json file. [2]

    [1] http://en.wikipedia.org/wiki/Voronoi_diagram

    [2] http://www.pointerpointer.com/gridPositions.json
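    In practice, "which Voronoi cell is the cursor in?" reduces to a nearest-neighbor lookup over the stored positions. A sketch with hypothetical entries in the style of gridPositions.json (field names assumed for illustration):

    ```python
    import math

    # Hypothetical entries: each photo is keyed by the pointer
    # position it covers.
    grid = [
        {"x": 120, "y": 80,  "image": "0001.jpg"},
        {"x": 400, "y": 300, "image": "0002.jpg"},
        {"x": 640, "y": 150, "image": "0003.jpg"},
    ]

    def nearest_image(cx: int, cy: int) -> str:
        """The photo whose stored point is closest to the cursor is,
        by definition, the one whose Voronoi cell contains (cx, cy)."""
        best = min(grid, key=lambda p: math.hypot(p["x"] - cx, p["y"] - cy))
        return best["image"]

    print(nearest_image(410, 310))  # 0002.jpg
    ```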

  8. Edicy is a nice CMS solution with in-line editing. http://www.edicy.com/
  9. I found a quick 15-page introduction to Scala; it took me about an hour to digest (of course I didn't dive in very deeply). It gave me sufficient knowledge to understand the article about monads.

    So here it is: http://www.scala-lang.org/docu/files/ScalaTutorial.pdf

This user hasn’t submitted anything.
