[Issue]: Accessible Interface for Autogen Studio & updated multimodal agents #1519

@Josephrp

Description

Describe the issue

This is specific to Autogen Studio following my discussion with folks working with autogen :-)

Create a fully accessible interface for autogen studio

It's very important to me that the tools I use are fully accessible. That often means supporting multimodal user input.

  • In that sense, multimodal on-device models are very useful.

User Input

  • image, audio, and text inputs using on-device models
  • audio and text outputs for Autogen Studio responses
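
A minimal sketch of the input side of this idea, under stated assumptions: every input is reduced to plain text before it reaches an agent. Nothing here is Autogen Studio's actual API. `UserInput`, `normalize_input`, and the two `fake_*` converters are hypothetical names used for illustration only, and the converters are placeholders standing in for whatever on-device speech-to-text or image-to-text models end up being used.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class UserInput:
    """Hypothetical container for one piece of user input."""
    modality: str   # "text", "audio", or "image"
    payload: bytes  # raw bytes; UTF-8 encoded when modality is "text"


# Hypothetical converter type: raw bytes in, plain text out.
Converter = Callable[[bytes], str]


def normalize_input(item: UserInput, converters: Dict[str, Converter]) -> str:
    """Turn any supported input into text that an agent can consume."""
    if item.modality == "text":
        return item.payload.decode("utf-8")
    try:
        convert = converters[item.modality]
    except KeyError:
        raise ValueError(f"unsupported modality: {item.modality}") from None
    return convert(item.payload)


# Placeholder converters (assumptions): a real setup would call
# on-device models here instead of returning a canned string.
def fake_speech_to_text(audio: bytes) -> str:
    return f"[transcript of {len(audio)} audio bytes]"


def fake_image_to_text(image: bytes) -> str:
    return f"[caption of {len(image)} image bytes]"


CONVERTERS: Dict[str, Converter] = {
    "audio": fake_speech_to_text,
    "image": fake_image_to_text,
}
```

Keeping the converters in a plain dictionary means a new modality, or a different on-device model, can be swapped in without touching the code that talks to the agents.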

Multimodal agents

The current multimodal agent examples do not take advantage of LLaVA-Plus yet. It's a great opportunity to review and update the multimodal agents and demonstrate them in context.

Requirements

Autogen Studio

  • audio input / output
  • image input

Blog : Autogen Studio with on device multimodal agents

Multimodal Agent Notebook Image Agent :

  • Simple image agent that can parse image inputs in 2-way chats
  • Complex image agent: on-device model and tools demo

Multimodal Agent Notebook Audio Agent(s) :

  • Simple audio agent that can convert audio to text
  • Complex audio demo that can do text to studio :-)

Linked Issues :

My Linked Repo :

Autogen Community Contributors !

Hey, we're all just doing our best to push our cool demos and ideas upstream. The best thing for me is to meet like-minded contributors so we can co-create the accessible interface we want to use ;-) and also keep it organised a bit more cleanly with "my linked repo", but:

  • that said, don't be shy about just contributing to this issue in your own branch :-)

Steps to reproduce

  • open Autogen Studio, cannot type: need audio input
  • open Autogen Studio, you're 4.5 years of age: need image to text
  • open Autogen Studio, you're driving and cannot pick up the laptop to read the output: need text to speech

Screenshots and logs

No response

Additional Information

No response
