[IR2Vec] Scale embeddings once in vocab analysis instead of repetitive scaling #143986

Open
wants to merge 1 commit into
base: users/svkeerthy/06-10-_mlininer_ir2vec_integrating_ir2vec_with_mlinliner
from

svkeerthy (Contributor) commented on Jun 12, 2025

This change scales the opcode, type, and argument embeddings once in IR2VecVocabAnalysis, so they no longer need to be rescaled every time embeddings are computed. The PR also refactors the vocabulary to explicitly define three sections (Opcodes, Types, and Arguments) that are used for computing embeddings.

(Tracking issue - #141817 ; partly fixes - #141832)
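
To illustrate why scaling once is equivalent to scaling on every use, here is a minimal standalone sketch. It does not use the LLVM types: `Vec`, `scaled`, `embedPerUse`, and `embedPreScaled` are hypothetical names for this example, and the three weight values are placeholders rather than values taken from this patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Placeholder weights for illustration only (not taken from the patch).
constexpr double OpcWeight = 1.0, TypeWeight = 0.5, ArgWeight = 0.2;

// Dst += Src * Factor, element-wise.
void scaleAndAdd(Vec &Dst, const Vec &Src, double Factor) {
  assert(Dst.size() == Src.size() && "Vectors must have the same dimension");
  for (std::size_t I = 0; I < Dst.size(); ++I)
    Dst[I] += Src[I] * Factor;
}

// Returns a copy of V with every element multiplied by W.
Vec scaled(Vec V, double W) {
  for (double &E : V)
    E *= W;
  return V;
}

// Before: raw vocabulary entries, weighted at every use.
Vec embedPerUse(const Vec &Opc, const Vec &Ty, const std::vector<Vec> &Args) {
  Vec Out(Opc.size(), 0.0);
  scaleAndAdd(Out, Opc, OpcWeight);
  scaleAndAdd(Out, Ty, TypeWeight);
  for (const Vec &A : Args)
    scaleAndAdd(Out, A, ArgWeight);
  return Out;
}

// After: entries were scaled once up front, so computing an
// instruction embedding is a plain element-wise sum.
Vec embedPreScaled(const Vec &Opc, const Vec &Ty,
                   const std::vector<Vec> &Args) {
  Vec Out(Opc.size(), 0.0);
  scaleAndAdd(Out, Opc, 1.0);
  scaleAndAdd(Out, Ty, 1.0);
  for (const Vec &A : Args)
    scaleAndAdd(Out, A, 1.0);
  return Out;
}

// Tolerance-based comparison, since floating-point results may differ
// in the last bits depending on how the compiler fuses operations.
bool approxEqual(const Vec &A, const Vec &B, double Tol = 1e-9) {
  if (A.size() != B.size())
    return false;
  for (std::size_t I = 0; I < A.size(); ++I)
    if (std::fabs(A[I] - B[I]) > Tol)
      return false;
  return true;
}
```

Calling `embedPerUse` on raw vectors and `embedPreScaled` on the same vectors passed through `scaled` with the matching weight should agree within tolerance. The saving is one multiply per element per lookup, paid once when the vocabulary is loaded instead.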

@svkeerthy svkeerthy changed the title from "[IR2Vec] Scale vocab" to "[IR2Vec] Scale embeddings once in vocab analysis instead of repetitive scaling" on Jun 12, 2025
@svkeerthy svkeerthy marked this pull request as ready for review June 12, 2025 22:45
llvmbot (Member) commented on Jun 12, 2025

@llvm/pr-subscribers-mlgo

@llvm/pr-subscribers-llvm-analysis

Author: S. VenkataKeerthy (svkeerthy)

Changes

This change scales the opcode, type, and argument embeddings once in IR2VecVocabAnalysis, so they no longer need to be rescaled every time embeddings are computed. The PR also refactors the vocabulary to explicitly define three sections (Opcodes, Types, and Arguments) that are used for computing embeddings.

(Tracking issue - #141817 ; partly fixes - #141832)


Patch is 149.98 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/143986.diff

16 Files Affected:

  • (modified) llvm/include/llvm/Analysis/IR2Vec.h (+15-1)
  • (modified) llvm/lib/Analysis/IR2Vec.cpp (+102-39)
  • (modified) llvm/lib/Analysis/models/seedEmbeddingVocab75D.json (+70-63)
  • (modified) llvm/lib/Passes/PassRegistry.def (+24-17)
  • (added) llvm/test/Analysis/IR2Vec/Inputs/dummy_2D_vocab.json (+11)
  • (modified) llvm/test/Analysis/IR2Vec/Inputs/dummy_3D_vocab.json (+13-5)
  • (modified) llvm/test/Analysis/IR2Vec/Inputs/dummy_5D_vocab.json (+15-9)
  • (added) llvm/test/Analysis/IR2Vec/Inputs/incorrect_vocab1.json (+11)
  • (added) llvm/test/Analysis/IR2Vec/Inputs/incorrect_vocab2.json (+12)
  • (added) llvm/test/Analysis/IR2Vec/Inputs/incorrect_vocab3.json (+12)
  • (added) llvm/test/Analysis/IR2Vec/Inputs/incorrect_vocab4.json (+16)
  • (modified) llvm/test/Analysis/IR2Vec/basic.ll (+13-1)
  • (added) llvm/test/Analysis/IR2Vec/dbg-inst.ll (+13)
  • (added) llvm/test/Analysis/IR2Vec/unreachable.ll (+42)
  • (added) llvm/test/Analysis/IR2Vec/vocab-test.ll (+20)
  • (modified) llvm/unittests/Analysis/IR2VecTest.cpp (+6)
diff --git a/llvm/include/llvm/Analysis/IR2Vec.h b/llvm/include/llvm/Analysis/IR2Vec.h
index de67955d85d7c..f1aaf4cd2e013 100644
--- a/llvm/include/llvm/Analysis/IR2Vec.h
+++ b/llvm/include/llvm/Analysis/IR2Vec.h
@@ -108,6 +108,7 @@ struct Embedding {
   /// Arithmetic operators
   Embedding &operator+=(const Embedding &RHS);
   Embedding &operator-=(const Embedding &RHS);
+  Embedding &operator*=(double Factor);
 
   /// Adds Src Embedding scaled by Factor with the called Embedding.
   /// Called_Embedding += Src * Factor
@@ -116,6 +117,8 @@ struct Embedding {
   /// Returns true if the embedding is approximately equal to the RHS embedding
   /// within the specified tolerance.
   bool approximatelyEquals(const Embedding &RHS, double Tolerance = 1e-6) const;
+
+  void print(raw_ostream &OS) const;
 };
 
 using InstEmbeddingsMap = DenseMap<const Instruction *, Embedding>;
@@ -234,6 +237,8 @@ class IR2VecVocabResult {
 class IR2VecVocabAnalysis : public AnalysisInfoMixin<IR2VecVocabAnalysis> {
   ir2vec::Vocab Vocabulary;
   Error readVocabulary();
+  Error parseVocabSection(const char *Key, const json::Value ParsedVocabValue,
+                          ir2vec::Vocab &TargetVocab, unsigned &Dim);
   void emitError(Error Err, LLVMContext &Ctx);
 
 public:
@@ -249,7 +254,6 @@ class IR2VecVocabAnalysis : public AnalysisInfoMixin<IR2VecVocabAnalysis> {
 /// functions.
 class IR2VecPrinterPass : public PassInfoMixin<IR2VecPrinterPass> {
   raw_ostream &OS;
-  void printVector(const ir2vec::Embedding &Vec) const;
 
 public:
   explicit IR2VecPrinterPass(raw_ostream &OS) : OS(OS) {}
@@ -257,6 +261,16 @@ class IR2VecPrinterPass : public PassInfoMixin<IR2VecPrinterPass> {
   static bool isRequired() { return true; }
 };
 
+/// This pass prints the embeddings in the vocabulary
+class IR2VecVocabPrinterPass : public PassInfoMixin<IR2VecVocabPrinterPass> {
+  raw_ostream &OS;
+
+public:
+  explicit IR2VecVocabPrinterPass(raw_ostream &OS) : OS(OS) {}
+  PreservedAnalyses run(Module &M, ModuleAnalysisManager &MAM);
+  static bool isRequired() { return true; }
+};
+
 } // namespace llvm
 
 #endif // LLVM_ANALYSIS_IR2VEC_H
diff --git a/llvm/lib/Analysis/IR2Vec.cpp b/llvm/lib/Analysis/IR2Vec.cpp
index fa38c35796a0e..f51d3252d6606 100644
--- a/llvm/lib/Analysis/IR2Vec.cpp
+++ b/llvm/lib/Analysis/IR2Vec.cpp
@@ -85,6 +85,12 @@ Embedding &Embedding::operator-=(const Embedding &RHS) {
   return *this;
 }
 
+Embedding &Embedding::operator*=(double Factor) {
+  std::transform(this->begin(), this->end(), this->begin(),
+                 [Factor](double Elem) { return Elem * Factor; });
+  return *this;
+}
+
 Embedding &Embedding::scaleAndAdd(const Embedding &Src, float Factor) {
   assert(this->size() == Src.size() && "Vectors must have the same dimension");
   for (size_t Itr = 0; Itr < this->size(); ++Itr)
@@ -101,6 +107,13 @@ bool Embedding::approximatelyEquals(const Embedding &RHS,
   return true;
 }
 
+void Embedding::print(raw_ostream &OS) const {
+  OS << " [";
+  for (const auto &Elem : Data)
+    OS << " " << format("%.2f", Elem) << " ";
+  OS << "]\n";
+}
+
 // ==----------------------------------------------------------------------===//
 // Embedder and its subclasses
 //===----------------------------------------------------------------------===//
@@ -196,18 +209,12 @@ void SymbolicEmbedder::computeEmbeddings(const BasicBlock &BB) const {
   for (const auto &I : BB.instructionsWithoutDebug()) {
     Embedding InstVector(Dimension, 0);
 
-    const auto OpcVec = lookupVocab(I.getOpcodeName());
-    InstVector.scaleAndAdd(OpcVec, OpcWeight);
-
     // FIXME: Currently lookups are string based. Use numeric Keys
     // for efficiency.
-    const auto Type = I.getType();
-    const auto TypeVec = getTypeEmbedding(Type);
-    InstVector.scaleAndAdd(TypeVec, TypeWeight);
-
+    InstVector += lookupVocab(I.getOpcodeName());
+    InstVector += getTypeEmbedding(I.getType());
     for (const auto &Op : I.operands()) {
-      const auto OperandVec = getOperandEmbedding(Op.get());
-      InstVector.scaleAndAdd(OperandVec, ArgWeight);
+      InstVector += getOperandEmbedding(Op.get());
     }
     InstVecMap[&I] = InstVector;
     BBVector += InstVector;
@@ -251,6 +258,47 @@ bool IR2VecVocabResult::invalidate(
   return !(PAC.preservedWhenStateless());
 }
 
+Error IR2VecVocabAnalysis::parseVocabSection(const char *Key,
+                                             const json::Value ParsedVocabValue,
+                                             ir2vec::Vocab &TargetVocab,
+                                             unsigned &Dim) {
+  assert(Key && "Key cannot be null");
+
+  json::Path::Root Path("");
+  const json::Object *RootObj = ParsedVocabValue.getAsObject();
+  if (!RootObj) {
+    return createStringError(errc::invalid_argument,
+                             "JSON root is not an object");
+  }
+
+  const json::Value *SectionValue = RootObj->get(Key);
+  if (!SectionValue)
+    return createStringError(errc::invalid_argument,
+                             "Missing '" + std::string(Key) +
+                                 "' section in vocabulary file");
+  if (!json::fromJSON(*SectionValue, TargetVocab, Path))
+    return createStringError(errc::illegal_byte_sequence,
+                             "Unable to parse '" + std::string(Key) +
+                                 "' section from vocabulary");
+
+  Dim = TargetVocab.begin()->second.size();
+  if (Dim == 0)
+    return createStringError(errc::illegal_byte_sequence,
+                             "Dimension of '" + std::string(Key) +
+                                 "' section of the vocabulary is zero");
+
+  if (!std::all_of(TargetVocab.begin(), TargetVocab.end(),
+                   [Dim](const std::pair<StringRef, Embedding> &Entry) {
+                     return Entry.second.size() == Dim;
+                   }))
+    return createStringError(
+        errc::illegal_byte_sequence,
+        "All vectors in the '" + std::string(Key) +
+            "' section of the vocabulary are not of the same dimension");
+
+  return Error::success();
+};
+
 // FIXME: Make this optional. We can avoid file reads
 // by auto-generating a default vocabulary during the build time.
 Error IR2VecVocabAnalysis::readVocabulary() {
@@ -259,32 +307,40 @@ Error IR2VecVocabAnalysis::readVocabulary() {
     return createFileError(VocabFile, BufOrError.getError());
 
   auto Content = BufOrError.get()->getBuffer();
-  json::Path::Root Path("");
+
   Expected<json::Value> ParsedVocabValue = json::parse(Content);
   if (!ParsedVocabValue)
     return ParsedVocabValue.takeError();
 
-  bool Res = json::fromJSON(*ParsedVocabValue, Vocabulary, Path);
-  if (!Res)
-    return createStringError(errc::illegal_byte_sequence,
-                             "Unable to parse the vocabulary");
+  ir2vec::Vocab OpcodeVocab, TypeVocab, ArgVocab;
+  unsigned OpcodeDim, TypeDim, ArgDim;
+  if (auto Err = parseVocabSection("Opcodes", *ParsedVocabValue, OpcodeVocab,
+                                   OpcodeDim))
+    return Err;
 
-  if (Vocabulary.empty())
-    return createStringError(errc::illegal_byte_sequence,
-                             "Vocabulary is empty");
+  if (auto Err =
+          parseVocabSection("Types", *ParsedVocabValue, TypeVocab, TypeDim))
+    return Err;
 
-  unsigned Dim = Vocabulary.begin()->second.size();
-  if (Dim == 0)
+  if (auto Err =
+          parseVocabSection("Arguments", *ParsedVocabValue, ArgVocab, ArgDim))
+    return Err;
+
+  if (!(OpcodeDim == TypeDim && TypeDim == ArgDim))
     return createStringError(errc::illegal_byte_sequence,
-                             "Dimension of vocabulary is zero");
+                             "Vocabulary sections have different dimensions");
 
-  if (!std::all_of(Vocabulary.begin(), Vocabulary.end(),
-                   [Dim](const std::pair<StringRef, Embedding> &Entry) {
-                     return Entry.second.size() == Dim;
-                   }))
-    return createStringError(
-        errc::illegal_byte_sequence,
-        "All vectors in the vocabulary are not of the same dimension");
+  auto scaleVocabSection = [](ir2vec::Vocab &Vocab, double Weight) {
+    for (auto &Entry : Vocab)
+      Entry.second *= Weight;
+  };
+  scaleVocabSection(OpcodeVocab, OpcWeight);
+  scaleVocabSection(TypeVocab, TypeWeight);
+  scaleVocabSection(ArgVocab, ArgWeight);
+
+  Vocabulary.insert(OpcodeVocab.begin(), OpcodeVocab.end());
+  Vocabulary.insert(TypeVocab.begin(), TypeVocab.end());
+  Vocabulary.insert(ArgVocab.begin(), ArgVocab.end());
 
   return Error::success();
 }
@@ -304,7 +360,7 @@ void IR2VecVocabAnalysis::emitError(Error Err, LLVMContext &Ctx) {
 IR2VecVocabAnalysis::Result
 IR2VecVocabAnalysis::run(Module &M, ModuleAnalysisManager &AM) {
   auto Ctx = &M.getContext();
-  // FIXME: Scale the vocabulary once. This would avoid scaling per use later.
+
   // If vocabulary is already populated by the constructor, use it.
   if (!Vocabulary.empty())
     return IR2VecVocabResult(std::move(Vocabulary));
@@ -323,16 +379,9 @@ IR2VecVocabAnalysis::run(Module &M, ModuleAnalysisManager &AM) {
 }
 
 // ==----------------------------------------------------------------------===//
-// IR2VecPrinterPass
+// Printer Passes
 //===----------------------------------------------------------------------===//
 
-void IR2VecPrinterPass::printVector(const Embedding &Vec) const {
-  OS << " [";
-  for (const auto &Elem : Vec)
-    OS << " " << format("%.2f", Elem) << " ";
-  OS << "]\n";
-}
-
 PreservedAnalyses IR2VecPrinterPass::run(Module &M,
                                          ModuleAnalysisManager &MAM) {
   auto IR2VecVocabResult = MAM.getResult<IR2VecVocabAnalysis>(M);
@@ -353,7 +402,7 @@ PreservedAnalyses IR2VecPrinterPass::run(Module &M,
 
     OS << "IR2Vec embeddings for function " << F.getName() << ":\n";
     OS << "Function vector: ";
-    printVector(Emb->getFunctionVector());
+    Emb->getFunctionVector().print(OS);
 
     OS << "Basic block vectors:\n";
     const auto &BBMap = Emb->getBBVecMap();
@@ -361,7 +410,7 @@ PreservedAnalyses IR2VecPrinterPass::run(Module &M,
       auto It = BBMap.find(&BB);
       if (It != BBMap.end()) {
         OS << "Basic block: " << BB.getName() << ":\n";
-        printVector(It->second);
+        It->second.print(OS);
       }
     }
 
@@ -373,10 +422,24 @@ PreservedAnalyses IR2VecPrinterPass::run(Module &M,
         if (It != InstMap.end()) {
           OS << "Instruction: ";
           I.print(OS);
-          printVector(It->second);
+          It->second.print(OS);
         }
       }
     }
   }
   return PreservedAnalyses::all();
 }
+
+PreservedAnalyses IR2VecVocabPrinterPass::run(Module &M,
+                                              ModuleAnalysisManager &MAM) {
+  auto IR2VecVocabResult = MAM.getResult<IR2VecVocabAnalysis>(M);
+  assert(IR2VecVocabResult.isValid() && "IR2Vec Vocabulary is invalid");
+
+  auto Vocab = IR2VecVocabResult.getVocabulary();
+  for (const auto &Entry : Vocab) {
+    OS << "Key: " << Entry.first << ": ";
+    Entry.second.print(OS);
+  }
+
+  return PreservedAnalyses::all();
+}
\ No newline at end of file
diff --git a/llvm/lib/Analysis/models/seedEmbeddingVocab75D.json b/llvm/lib/Analysis/models/seedEmbeddingVocab75D.json
index 65ca240deba13..a0d539afd5294 100644
--- a/llvm/lib/Analysis/models/seedEmbeddingVocab75D.json
+++ b/llvm/lib/Analysis/models/seedEmbeddingVocab75D.json
@@ -1,65 +1,72 @@
 {
-"add":[0.09571152, 0.08334279, 0.07070262, 0.14084256, 0.04825167, -0.12893851, -0.12180215, -0.07194936, -0.05329844, -0.28276998, 0.00202254, 0.07488817, -0.16139749, 0.07618549, -0.06084673, 0.10536429, -0.03031596, 0.16402271, 0.22730531, -0.0015432, 0.01986631, -0.15808797, 0.03030382, -0.04201249, 0.01689876, -0.30041003, -0.06479812, 0.1282431, -0.09032831, 0.14007935, -0.23540203, 0.00552223, 0.15325953, 0.13147154, -0.15204638, 0.1616615, 0.05758473, -0.1385157, 0.16438901, -0.14482859, 0.07895052, 0.18407233, 0.15283448, -0.00081216, -0.17934133, -0.16779658, 0.09044863, -0.18453072, -0.00552684, 0.01191218, 0.18504514, 0.13140059, -0.0174882, -0.18127315, 0.02269725, -0.02048657, 0.10858727, 0.0074029, 0.09485064, -0.13431476, 0.07491371, -0.17498694, -0.32914585, -0.14159656, -0.03594542, 0.04091231, 0.00298631, -0.08933277, -0.07178984, 0.04144038, 0.12151367, -0.09504163, 0.13691336, 0.07825345, 0.13958281],
-"alloca":[0.12644756, -0.20912014, 0.07298578, 0.13963142, 0.03932726, -0.12778169, -0.13084207, -0.09090705, -0.01497083, 0.0646746, 0.14847252, 0.08546949, -0.11579383, 0.07812478, -0.16431178, 0.08578802, 0.06094746, -0.1239017, -0.02789196, -0.01074964, -0.1738938, -0.1629063, -0.07137732, -0.02845656, 0.1728318, 0.13779363, -0.06250989, 0.03479109, -0.08423121, 0.14009888, 0.09620709, -0.02287545, -0.04616966, -0.19591278, -0.19636007, 0.21825573, 0.07949521, -0.14614704, 0.17752613, -0.15099092, -0.04518029, 0.17774306, 0.18947133, -0.00197501, -0.12619424, -0.18458585, -0.09960615, 0.01162648, 0.21306573, 0.0468024, 0.201505, 0.09970672, 0.06599383, 0.17757463, -0.11899275, 0.10299123, 0.06038174, -0.07116731, -0.15743989, -0.1258558, 0.1535076, 0.01872367, -0.14336765, -0.19708565, -0.26796287, 0.02628843, -0.0009325, -0.08809827, -0.12970725, 0.17223337, -0.10651242, -0.09162157, 0.03264611, 0.0911452, 0.09506542],
-"and":[-0.12902537, 0.07848564, 0.07048505, 0.13578993, 0.03570583, -0.1242408, -0.12619697, -0.06914084, -0.0515872, 0.13225996, 0.01673339, 0.0763608, 0.18325369, 0.07690141, -0.09591792, 0.10353354, -0.02401061, 0.16493416, 0.21942355, -0.00400911, 0.03811587, -0.15406959, -0.07134877, -0.02946596, -0.03854588, 0.28656727, -0.06625008, 0.12780443, -0.04631379, 0.13919763, -0.15808977, -0.00312698, 0.14692454, 0.18495218, -0.14863448, 0.13449706, 0.06471325, -0.13883992, 0.17630532, -0.16833898, 0.07391811, 0.17909151, 0.18229873, -0.00381102, -0.17680968, -0.1645554, 0.10016466, -0.07963493, 0.00130218, 0.05646244, 0.18222143, 0.10511146, -0.0191175, -0.08713559, 0.25423968, -0.02557301, 0.04319789, 0.03259414, 0.07209402, -0.13169754, 0.07424775, -0.17216511, -0.32057068, -0.13833733, -0.0658454, 0.02420194, -0.04393166, -0.08238692, -0.07023077, 0.04014502, 0.20101993, -0.09093616, 0.13076238, 0.09114857, -0.06483845],
-"ashr":[0.16414718, -0.02078147, 0.07353837, 0.14210321, 0.08068018, -0.13586298, -0.15728961, -0.10365791, 0.00359197, 0.04608767, 0.35770142, 0.08003625, 0.00944533, 0.08471741, 0.05809571, 0.09202059, -0.18349345, 0.22169511, 0.24418135, 0.19192688, -0.1956347, -0.17401719, -0.07583863, -0.02781139, 0.0952112, -0.01444751, -0.27259794, 0.14392436, -0.13541143, 0.1410963, 0.04314299, -0.01641751, -0.05448177, -0.22287542, -0.2131822, 0.25112826, 0.06918604, -0.14414005, 0.060288, -0.09658266, -0.09122665, -0.02779044, 0.20349248, 0.0041391, 0.10610974, -0.20890503, -0.09881835, -0.07368057, 0.22467837, 0.00075097, 0.2147348, -0.02612463, -0.01696278, -0.0649786, 0.15041149, 0.02361087, 0.0211603, -0.03706622, 0.18296233, -0.14298625, 0.14614436, 0.02273145, 0.0209446, -0.21062987, 0.16911499, -0.03668665, 0.00197532, -0.09607943, -0.08398084, 0.02006913, 0.05739584, -0.07919859, 0.19634944, 0.11082727, -0.06584227],
-"bitcast":[0.1675692, -0.12870269, 0.04048132, -0.1670965, -0.1279611, 0.02615386, -0.16829294, -0.09034907, 0.10913523, -0.07819421, 0.23986322, -0.05966561, 0.08410738, 0.19072439, 0.06047394, 0.02999627, -0.16747619, -0.06076627, -0.02673951, -0.1619169, 0.06443421, -0.13788716, -0.05644303, 0.01361013, -0.06858975, -0.06005004, 0.10011288, -0.05508338, -0.10613093, -0.11281271, -0.00758647, -0.12425531, 0.05333719, -0.16881412, -0.20088236, 0.06015657, -0.16405901, 0.06226884, 0.09171099, -0.09500738, 0.07907875, 0.0776544, -0.18457057, -0.11278627, -0.12111131, -0.10180638, 0.18328871, 0.18770072, 0.20346186, 0.10305139, -0.18344335, 0.10693213, -0.10920919, -0.05994263, 0.20354497, -0.03093485, 0.14214055, 0.00580597, 0.10480052, -0.09955201, -0.134185, -0.02904563, 0.00175069, 0.17646657, 0.01348841, -0.02030338, -0.06537742, 0.10032237, 0.15315783, 0.0102667, 0.07717734, -0.01060431, 0.14727928, -0.16261302, -0.06770234],
-"br":[0.11499645, 0.0824301, 0.07218035, 0.13258332, -0.13419574, -0.12916718, -0.12626709, -0.06741403, -0.02857593, -0.2543893, 0.02193022, 0.0760875, -0.1562702, 0.07712954, 0.3149361, 0.10217083, -0.041038, 0.16601022, 0.01906607, -0.02043359, 0.05471838, -0.15233372, -0.06945753, -0.02313732, -0.07342829, 0.3331809, -0.05246306, -0.0269839, -0.05435036, 0.13908924, 0.32604694, 0.00170966, 0.14997493, 0.13026518, -0.14908995, 0.12238151, -0.06773318, -0.13566032, 0.16068587, -0.12842499, 0.04970508, 0.17827724, 0.09729939, -0.00447832, -0.1739753, -0.16429187, 0.09886666, -0.08058207, 0.00714044, 0.04585538, 0.13424252, 0.11376464, -0.01675582, 0.17901348, -0.00653374, -0.01570439, 0.13032894, 0.01734108, 0.16833901, -0.1173776, 0.07662185, -0.15942436, -0.21173944, -0.10505079, 0.0597497, 0.03491669, 0.00338842, -0.04969047, -0.07644061, 0.04528612, 0.254365, -0.09514527, 0.12015092, 0.08262096, -0.02352029],
-"call":[0.10815012, -0.12419116, 0.18759736, -0.1905027, 0.01619313, -0.13483052, -0.12278763, -0.07051246, -0.0083437, 0.25107145, -0.16601063, 0.08127163, -0.17432374, -0.18380919, 0.24335551, 0.07208319, -0.04401246, 0.11606008, -0.02733191, 0.02098145, 0.019888, -0.13705409, -0.07569158, -0.03072285, 0.16870692, -0.09787013, -0.09340432, 0.01931342, -0.03557841, 0.14359893, -0.1592094, -0.00055867, 0.159316, 0.1099042, -0.11837319, 0.08741318, 0.03364393, -0.12831019, 0.10450637, -0.12699029, -0.20213994, 0.18390144, 0.11092624, -0.00209971, -0.13063665, 0.19996215, 0.09006448, -0.07840014, 0.22549215, 0.02587176, 0.13374338, 0.11009877, -0.01874998, -0.21446206, 0.02377797, -0.01036531, 0.05427047, 0.01418843, 0.00771817, -0.12639529, -0.10334941, -0.12244401, 0.30014148, -0.09857437, 0.21212636, 0.03429029, -0.04947309, 0.1023307, -0.07743628, 0.03006962, -0.24868701, -0.02357339, 0.11574048, 0.06895301, -0.363474],
-"constant":[-0.2850312, -0.22839, 0.12669143, -0.0674703, -0.12639391, -0.00477266, 0.04786542, 0.06336267, 0.08660185, 0.12805316, -0.07146342, 0.21539183, -0.0624397, -0.02638953, -0.28688517, 0.28374302, 0.05338082, 0.05559688, -0.13133128, 0.12440272, -0.03583231, 0.29848817, 0.13930812, 0.15453401, 0.0538353, -0.06874479, 0.00262802, 0.27964258, 0.19028014, -0.16371843, -0.05762961, 0.20059372, -0.20804578, -0.06549844, 0.09732475, -0.01551855, 0.21226783, 0.05889762, -0.07560658, 0.11312829, -0.04594622, -0.27309528, -0.05293005, 0.18953343, 0.05463868, -0.31045213, -0.04364616, 0.11005993, 0.12489324, -0.05413342, -0.05814561, -0.26131225, -0.18863814, 0.31165487, -0.08796364, -0.19958755, -0.10849535, 0.14899114, -0.01385941, 0.29359573, -0.01349372, 0.0562498, 0.10977754, 0.08993197, 0.06231657, -0.13509868, -0.20968516, 0.03578237, 0.15356435, -0.17766887, -0.11509016, 0.06898113, -0.06665445, -0.14065051, 0.34711906],
-"extractelement":[-1.62098512e-01, -2.00751424e-02, 1.71776459e-01, 1.38723463e-01, -1.60769299e-01, 9.30800363e-02, -1.24304645e-01, 1.45934001e-01, -5.04420400e-02, -7.54220188e-02, -5.30924797e-02, 8.55294541e-02, 1.17488159e-02, -1.82809234e-01, 9.37484950e-02, 8.38199258e-02, 1.26266479e-01, 9.31843743e-02, -2.26742271e-02, 6.50950940e-03, 5.02749532e-03, -1.14133619e-01, 1.80842891e-01, -9.63811725e-02, 1.67923152e-01, -5.84629439e-02, 1.04362026e-01, 8.92093012e-05, 7.93750137e-02, 1.39107972e-01, -8.40696543e-02, -1.59506593e-02, 1.99361086e-01, 1.93857521e-01, -1.46850392e-01, -1.78695560e-01, 9.88712162e-03, -1.40836969e-01, 1.77736521e-01, -1.61047727e-01, 3.51648539e-01, 1.79638579e-01, 1.49385989e-01, -6.06128015e-03, -1.72102928e-01, -1.81016047e-02, 1.01466164e-01, -1.28312945e-01, -8.05212185e-02, 6.28385469e-02, 1.15654446e-01, 1.91848025e-01, 3.03963851e-02, 3.55794169e-02, 1.40873834e-01, -1.44319711e-02, 1.85423180e-01, 1.16111919e-01, -7.47816712e-02, -1.14719503e-01, 1.02934733e-01, -1.83810964e-01, -2.64400076e-02, -8.99282843e-02, -1.90383971e-01, 7.31386840e-02, -4.36249487e-02, -4.71482053e-02, 1.07486300e-01, 1.09736443e-01, 4.15226035e-02, -1.42309800e-01, 8.96709636e-02, 6.64985999e-02, -6.13647103e-02],
-"extractvalue":[0.08820406, -0.0182279, 0.01493346, -0.17219813, 0.02927338, -0.17379265, -0.05663937, -0.06805898, -0.21235467, -0.01969833, -0.15152055, -0.18049374, 0.01062911, 0.07935719, -0.01993761, 0.12405304, -0.03198355, 0.13872959, 0.18017697, -0.0100...
[truncated]

@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-12-_ir2vec_scale_vocab branch from 8e397f7 to ac378a9 Compare June 12, 2025 23:34
svkeerthy (Contributor Author) left a comment

@albertcohen - Please have a look. I am not able to add you as a reviewer.

@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-12-_ir2vec_scale_vocab branch from ac378a9 to 2657262 Compare June 13, 2025 00:01
-"Unable to parse the vocabulary");
+ir2vec::Vocab OpcodeVocab, TypeVocab, ArgVocab;
+unsigned OpcodeDim, TypeDim, ArgDim;
+if (auto Err = parseVocabSection("Opcodes", *ParsedVocabValue, OpcodeVocab,
mtrofin (Member) commented on Jun 13, 2025

This changes the format, best to also update the doc.

Also, this means the sections must all be present (in any order), even if empty, correct? SGTM, just something worth spelling out.

Contributor Author

Correct. Will put it in the doc.
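
The layout requirement discussed above can be sketched outside LLVM as follows. This is a simplified illustration, not the patch's `parseVocabSection`: nested `std::map`s stand in for the parsed JSON root, `validateVocab` is a hypothetical helper, and letting an empty section pass without contributing a dimension is an assumption based on this thread rather than on the code in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <map>
#include <optional>
#include <string>
#include <vector>

using Vec = std::vector<double>;
using Section = std::map<std::string, Vec>;
// Stand-in for the parsed JSON root object: section name -> entries.
using Root = std::map<std::string, Section>;

// Returns an error message, or std::nullopt if the layout is acceptable.
// All three sections must be present (order does not matter, since lookup
// is by key). Assumption: an empty section is allowed and simply
// contributes no dimension to the consistency check.
std::optional<std::string> validateVocab(const Root &R) {
  std::optional<std::size_t> Dim;
  for (const char *Key : {"Opcodes", "Types", "Arguments"}) {
    auto It = R.find(Key);
    if (It == R.end())
      return "Missing '" + std::string(Key) + "' section in vocabulary";
    for (const auto &Entry : It->second) {
      const std::size_t Size = Entry.second.size();
      if (Size == 0)
        return "Zero-dimension vector in '" + std::string(Key) + "' section";
      if (!Dim)
        Dim = Size; // The first non-empty vector fixes the dimension.
      else if (*Dim != Size)
        return "Dimension mismatch in '" + std::string(Key) + "' section";
    }
  }
  return std::nullopt;
}
```

For example, a root that has all three keys, with `Arguments` mapped to an empty section, passes this check, while a root without an `Arguments` key is rejected. Unlike the patch, this sketch folds the per-section and cross-section dimension checks into a single pass, so its error messages are less specific.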

@@ -104,7 +106,10 @@ MODULE_PASS("lower-ifunc", LowerIFuncPass())
 MODULE_PASS("simplify-type-tests", SimplifyTypeTestsPass())
 MODULE_PASS("lowertypetests", LowerTypeTestsPass())
 MODULE_PASS("fatlto-cleanup", FatLtoCleanup())
-MODULE_PASS("pgo-force-function-attrs", PGOForceFunctionAttrsPass(PGOOpt ? PGOOpt->ColdOptType : PGOOptions::ColdFuncOpt::Default))
+MODULE_PASS("pgo-force-function-attrs",
+PGOForceFunctionAttrsPass(PGOOpt
Member

Can you make the unrelated stylistic changes to this file in a separate patch?

Contributor Author

Yeah, will do. Missed the unrelated formatting changes.

@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-10-_mlininer_ir2vec_integrating_ir2vec_with_mlinliner branch from b7ec652 to d151083 Compare June 13, 2025 17:45
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-12-_ir2vec_scale_vocab branch from 2657262 to 730ab91 Compare June 13, 2025 17:46
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-10-_mlininer_ir2vec_integrating_ir2vec_with_mlinliner branch from d151083 to a2bec77 Compare June 13, 2025 18:18
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-12-_ir2vec_scale_vocab branch from 730ab91 to d31d756 Compare June 13, 2025 18:18
@svkeerthy svkeerthy requested a review from mtrofin June 13, 2025 18:20
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-10-_mlininer_ir2vec_integrating_ir2vec_with_mlinliner branch from a2bec77 to 9124e83 Compare June 17, 2025 18:00
@svkeerthy svkeerthy force-pushed the users/svkeerthy/06-12-_ir2vec_scale_vocab branch from d31d756 to 32d16aa Compare June 17, 2025 18:01
3 participants