Vertex AI Matching Engine - Vector Similarity Search

Putting a similarity index into production at scale is hard. It takes a lot of infrastructure working closely together, because you need to serve a large amount of data at low latency. Along the way you run into topics like sharding, hashing, trees, load balancing, efficient data transfer, data replication, and much more.
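
To make the problem concrete, here is a minimal sketch of what a similarity index does at its core: an exact, brute-force nearest-neighbor search in plain Python. This is not how Matching Engine works internally; the function names and the toy data are made up for illustration. Every query compares against every stored vector, which is exactly what stops scaling and why approximate indexes, sharding, and the rest of the infrastructure come into play.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, index, k=3):
    """Score the query against every stored vector and return the top k.

    `index` is a plain dict of id -> embedding (a hypothetical stand-in for a
    real index). The cost grows linearly with the number of stored vectors.
    """
    scored = [(cosine_similarity(query, vec), item_id) for item_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(item_id, score) for score, item_id in scored[:k]]


# Toy data, invented for this example.
toy_index = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
}
result = nearest_neighbors([1.0, 0.0], toy_index, k=2)
print(result)
```

With a few hundred vectors this is fine. With millions of vectors and a low-latency budget, the full scan per query is the part a managed service has to replace.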

Check out the notebook and the article on how to get started with Google Cloud Vertex AI Matching Engine.

If you enjoyed this video, please subscribe to the channel ❤️


If you or your company is looking for advice on the cloud or ML, check out the company I work for.
We offer consulting, workshops, and training at zero cost. Imagine an extension for your team without additional costs.

#vertexai #googlecloud #machinelearning #mlengineer #doit


▬ Timestamps ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

00:00 Introduction
00:32 Statement
00:47 Use Cases
01:25 Embedding
01:47 Input
02:23 Types
02:54 VPC
04:05 Create Embeddings
06:50 Setup
07:00 VPC Setup
08:39 Create Index
11:58 Create Endpoint
13:00 Deploy Index
14:23 Update Index
15:10 Scale Index
16:46 Query
22:23 Bye
Comments

Nice tutorial. Matching Engine is really promising, but it does require some setup. I will try to reproduce this tutorial and see what happens.

jobiquirobi

Amazing, thank you!

I'm really keen to see that video about how to use Cloud Run to make the Vertex AI Endpoint more accessible. Did you end up making that video?

fwbjjjx

Great tutorial! Can you update algorithm parameters like leafNodeEmbeddingCount and leafNodesToSearchPercent on the fly? I tried using the gcloud update index command, but nothing changes when I describe the index afterward, even once the operation is complete.

tyronehou

Great content. Can you tell me about some alternatives? I am comparing options such as pgvector with a model to generate embeddings versus Matching Engine.

I would like to understand the pros and cons of those approaches.

LucasGomide

Thanks for the tutorial. Is there a LangChain-compatible retriever for this Matching Engine index?

ramsure

Thank you for the tutorial!
Is it possible to choose the machine type? I tried with 100 vectors (94 KB), and in the endpoint's basic info I see machine-type: n1-standard-16. According to the documentation, there seems to be a default machine type based on shard size. The documentation says: "When you create an index you must specify the shard size of the index", but there is no parameter that refers to shard size during index creation. It also says "you can determine what machine type to use when you deploy your index" but, same as before, there is no parameter that refers to machine-type. I am a bit confused :/

federicoph

Hey, awesome starter. Just a question: given I have an index created from a bucket, if I add new files to the same bucket, will the index reflect the new data files, either by itself or after triggering an update? Put simply, how can I add new data from a bucket to an existing index without rebuilding the entire index, something equivalent to the upsert functionality in Pinecone or Weaviate? The docs aren't helping me here.

MOHAMMADAUSAF

Thank you for the tutorial. With the Avro format there is an allow and deny option that you can set for the embeddings you insert. There is little documentation on how to use this in a query. Could you help with this?

anjanak

Thanks for the walkthrough. The documentation from GCP is quite messy.
Matching Engine doesn't seem to have great support for metadata filtering compared to other stores, only very basic operations. Any thoughts from your experience?

alexchan

Hi,
How can I make it work from outside the network? I mean, send a request and get a response from outside the network?

nooralsmadi

Hi,
Can you please help me understand how to orchestrate Vertex AI through Cloud Composer?

kadapa-rljg

Where did you mention the schema of the data file (the one with the input embedding vectors)?

akarshjainable

Can I do a batch prediction on the index? If yes, do I need a VPC network for that?

akarshjainable

Hey, what is the parameter that decides the number of neighbours returned? I tried changing num_neighbours to no avail. It only returns 10 neighbours.

majidalikhani

What is the need for endpoints?
When will you be uploading more videos?

AyushMandloi

Great tutorial! How does the deny list work?
Let's say I have a class fruit which will ONLY have deny list tokens (no allow) such as "apple", "mango", etc. How do I filter out "mango" in the query (search all fruits except mango)?

I have tried the following, but it does not work as expected:

json
{"id": "1", "embedding":[0.002792, 0.000492], "restricts": [{"namespace": "fruit", "deny": ["mango"]}]}

query
deny_namespace = match_service_pb2.Namespace()
deny_namespace.name = "fruit"


niladrishekhardutt
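
Several comments ask what the input data file looks like. The only sample on this page is the record quoted in the last comment: a JSON object with `id`, `embedding`, and an optional `restricts` list. As a rough sketch, assuming one such object per line, records in that shape could be generated as shown here. The helper names are hypothetical, the layout is inferred from that one sample, and the official documentation remains the reference for the accepted formats.

```python
import json


def to_record(item_id, embedding, restricts=None):
    """Serialize one datapoint as a single-line JSON object.

    Field names ("id", "embedding", "restricts") are taken from the sample
    record quoted in the comments; everything else here is an assumption.
    """
    record = {"id": str(item_id), "embedding": [float(x) for x in embedding]}
    if restricts:
        record["restricts"] = restricts
    return json.dumps(record)


def to_json_lines(items):
    """Join records into text with one JSON object per line (assumed layout)."""
    return "\n".join(to_record(*item) for item in items) + "\n"


# First tuple mirrors the commenter's sample; the second is invented.
items = [
    ("1", [0.002792, 0.000492], [{"namespace": "fruit", "deny": ["mango"]}]),
    ("2", [0.001, 0.5], None),
]
data = to_json_lines(items)
print(data)
```

The resulting text would then be written to a file and uploaded to the bucket the index is built from. The upload step and the bucket layout are not covered here.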