{"id":1126006,"date":"2024-06-13T16:37:09","date_gmt":"2024-06-13T20:37:09","guid":{"rendered":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/uncategorized\/recurrent-quantum-embedding-neural-network-and-its-application-in-vulnerability-detection-scientific-reports-nature-com\/"},"modified":"2024-06-13T16:37:09","modified_gmt":"2024-06-13T20:37:09","slug":"recurrent-quantum-embedding-neural-network-and-its-application-in-vulnerability-detection-scientific-reports-nature-com","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/quantum-computing\/recurrent-quantum-embedding-neural-network-and-its-application-in-vulnerability-detection-scientific-reports-nature-com\/","title":{"rendered":"Recurrent quantum embedding neural network and its application in vulnerability detection | Scientific Reports &#8211; Nature.com"},"content":{"rendered":"<p><p>    In this section, we first introduce the composition and    principles of the important components of RQENN including the    trainable encoding method based on parameterized binary index    and recurrent cell. Then we introduce the RQENN-based    classification model. Finally, we present the task flow of    applying RQENN classification model to vulnerability detection.  <\/p>\n<p>    Classical neural network models for processing NLP tasks first    need to tokenize the text and build a word dictionary,    according to which the text is converted into a digital index    sequence of words. Each digital index corresponds to a one-hot    vector, which are transformed into dense vectors by word    embedding methods involved in the network training to obtain a    more accurate vector representation. However, similar methods    migrated to QNN do not work. Specifically, in the classical    model, the one-hot vectors are sparse and orthogonal, which    means that when word embedding is performed, each word gets    only some of the weights from the embedding weight matrix    (W) as the representation    vector. This process can be viewed as using the one-hot vector    as the key to query the corresponding value in the weight map    (W), as shown in    Fig.1a. Thus, in the case    of random initialization of weights, the initial representation    vectors of all words are uncorrelated, and they establish    lexical connections as the training process proceeds. However,    in the quantum model, due to the properties of quantum    superposition and entanglement, the quantum state obtained from    encoded words (e.g., (|{psi    }_{1}rangle) in Fig.1b) is difficult to be    sparse and orthogonal as the classical one-hot vectors. This    implies that the initially encoded quantum state of each word    has some kind of connection, and the use of this quantum state    as the \"key\" inevitably leads to the \"value\" obtained from the    query being related to all the elements in the unitary matrix,    which contains various non-semantic connections. The use of    this quantum state as the \"key\" inevitably leads to a query    that yields a \"value\" that is related to all the elements of    the Missy's orthogonal weight matrix, and the results obtained    contain various non-semantic connections. This prevents QNN    from learning the semantics of words through the quantum    embedding method.  <\/p>\n<p>            (a) The process of token 'NULL' being encoded            into a quantum circuit. 
Figure 1: (a) Classical word embedding computation process. The one-hot vector is treated as a key to query a value in the map of weights. (b) The process of encoding the token 'NULL' into a quantum circuit. The state \(|\psi_1\rangle\) obtained after the rotation input layer is treated as a quantum one-hot vector; applying QEmbedding then yields \(|\psi_2\rangle\), which can be treated as a quantum dense vector used as the token representation.

To address this problem, we propose a trainable encoding method based on the parameterized binary index, which encodes code tokens as quantum state data and efficiently learns token semantics. The specific steps are as follows (a code sketch follows the example below):

Step I: Tokenize the source code into tokens, build a dictionary, and map tokens to numeric indexes.

Step II: Convert the numeric indexes from decimal to binary representation. For a dictionary containing \(N\) words, an index is represented by an \(n=\lceil \log_2 N\rceil\)-bit binary number.

Step III: Replace "0" and "1" in the binary indexes with the trainable parameters "\(\theta_0\)" and "\(\theta_1\)", forming parameterized binary indexes.

Step IV: Encode each parameterized binary index using \(n=\lceil \log_2 N\rceil\) qubits. Each bit of the input is encoded on the corresponding qubit through an Ry rotation gate.

Step V: Add a trainable layer containing parameters to the quantum circuit as the quantum embedding (QEmbedding) implementation.

The Ry rotation input layer and the QEmbedding layer in the above steps together form the trainable encoding layer.

As a simple example, for the following source code training set:

$$\left[\ \text{"VAR1 = NULL"},\ \text{"VAR2 = NULL"},\ \text{"VAR1 = VAR2"}\ \right],$$

we can build the following dictionary to map all code tokens to numeric indexes:

$$\{\ \text{'='}: 0,\ \text{'VAR1'}: 1,\ \text{'NULL'}: 2,\ \text{'VAR2'}: 3\ \}.$$

These numeric indexes are further converted to parameterized binary indexes:

$$\{\ \text{'='}: \theta_0\theta_0,\ \text{'VAR1'}: \theta_0\theta_1,\ \text{'NULL'}: \theta_1\theta_0,\ \text{'VAR2'}: \theta_1\theta_1\ \}.$$
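As a rough Python sketch of Steps I–III (our own illustration: the insertion-order vocabulary and the placeholder names theta_0/theta_1 are assumptions, so the exact index assignment may differ from the example above):

```python
import math

corpus = ["VAR1 = NULL", "VAR2 = NULL", "VAR1 = VAR2"]

# Step I: tokenize and map tokens to numeric indexes (insertion order).
vocab = {}
for line in corpus:
    for tok in line.split():
        vocab.setdefault(tok, len(vocab))

# Step II: decimal index -> n-bit binary string, n = ceil(log2(N)).
n = max(1, math.ceil(math.log2(len(vocab))))
binary = {tok: format(i, f"0{n}b") for tok, i in vocab.items()}

# Step III: replace bits '0'/'1' with the trainable angle placeholders;
# during training these placeholders are the parameters theta_0, theta_1.
param_index = {tok: ["theta_1" if b == "1" else "theta_0" for b in bits]
               for tok, bits in binary.items()}
print(param_index["NULL"])    # e.g. ['theta_1', 'theta_0']
```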
The parameters \"({theta    }_{0})\" and \"({theta    }_{1})\" are involved in the training process of QNN to    eliminate possible inherent connections, so that (|{psi }_{1}rangle) has the same    function as the classic one-hot vector. The one-hot vector is a    form transformed by symbols that is easy to use by the classic    network model, and the obtained (|{psi }_{1}rangle) is a form    transformed by symbols that is easy to use by QNN. This is the    unique aspect of trainable encoding based on parameterized    binary index and the key to improving model performance. Next,    (|{psi }_{1}rangle) learns    lexical connections between encoded words through a trainable    QEmbedding layer ({U}_{qe}({{varvec{theta}}}_{qe})),    which is similar to the classic word embedding principle. The    obtained quantum state (|{psi    }_{2}rangle) is described in Eq.(3), where ({U}_{qe}({{varvec{theta}}}_{qe})={[begin{array}{cccc}{{varvec{u}}}_{0}^{dagger}&    {{varvec{u}}}_{1}^{dagger}& {{varvec{u}}}_{2}^{dagger}&    {{varvec{u}}}_{3}^{dagger}end{array}]}^{dagger}).    At this point, the ({2}^{n})-dimensional dense vectors    corresponding to (|{psi    }_{2}rangle) are the representations of the words,    except that the words are converted from indexes to    quantum-friendly quantum state representations instead of    classical vector representations.  <\/p>\n<p>      $$|{psi }_{0}rangle      ={[begin{array}{cccc}{varepsilon }_{0}& {varepsilon }_{1}&      {varepsilon }_{2}& {varepsilon      }_{3}end{array}]}^{dagger}$$    <\/p>\n<p>      (1)    <\/p>\n<p>      $$left|{psi }_{1}right.rangle      =left{begin{array}{c}Ryleft({theta }_{0}right)otimes      Ryleft({theta }_{0}right)left|{psi }_{0}right.rangle      ,,, index={theta }_{0}{theta }_{0}\\ Ryleft({theta      }_{0}right)otimes Ryleft({theta }_{1}right)left|{psi      }_{0}right.rangle ,,, index={theta }_{0}{theta }_{1}\\      Ryleft({theta }_{1}right)otimes Ryleft({theta      }_{0}right)left|{psi }_{0}right.rangle ,,,      index={theta }_{1}{theta }_{0}\\ Ryleft({theta      }_{1}right)otimes Ryleft({theta }_{1}right)left|{psi      }_{0}right.rangle , ,, index={theta }_{1}{theta      }_{1}end{array}right.$$    <\/p>\n<p>      (2)    <\/p>\n<p>      $$|{psi }_{2}rangle      ={U}_{qe}({{varvec{theta}}}_{qe})|{psi }_{1}rangle      ={[begin{array}{cccc}{{varvec{u}}}_{0}^{dagger}|{psi      }_{1}rangle & {{varvec{u}}}_{1}^{dagger}|{psi      }_{1}rangle & {{varvec{u}}}_{2}^{dagger}|{psi      }_{1}rangle & {{varvec{u}}}_{3}^{dagger}|{psi      }_{1}rangle end{array}]}^{dagger}$$    <\/p>\n<p>      (3)    <\/p>\n<p>    Compared with the trainable encoding method based on    parameterized binary index, if the binary index obtained in    Step II is used for encoding, the fixed angle of the rotation    gate (0 or 1) will result in constant non-lexical connections    between (|{psi    }_{1}rangle) of different words. These connections are    brought into training process of the quantum word embedding    layer, possibly affecting the normal learning of lexical    connections of the code tokens. In fact, it can also make the    (N) quantum states    (|{psi }_{1}rangle)    orthogonal to each other like classical one-hot vectors by    choosing a suitable fixed angle of the rotation gate, this    method is called the \"orthogonal method\". It determines the    specific angles \"({theta    }_{0})\" and \"({theta    }_{1})\" to be used for replacing the binary \"0\" and \"1\"    before training. 
Compared with the trainable encoding method based on the parameterized binary index, if the plain binary index obtained in Step II is used for encoding, the fixed rotation gate angles (0 or 1) produce constant non-lexical connections between the \(|\psi_1\rangle\) of different words. These connections are carried into the training of the quantum word embedding layer and may interfere with the normal learning of lexical connections between code tokens. In fact, the \(N\) quantum states \(|\psi_1\rangle\) can also be made mutually orthogonal, like classical one-hot vectors, by choosing suitable fixed rotation angles; we call this the "orthogonal method". It determines, before training, the specific angles "\(\theta_0\)" and "\(\theta_1\)" used to replace the binary "0" and "1". By applying the \(N\) different rotation layers with angles "\(\theta_0\)" and "\(\theta_1\)" to \(N\) independent quantum circuits, we obtain \(N\) quantum states. We then use gradient descent to minimize the sum of the absolute values of the pairwise inner products of these \(N\) quantum states under random initialization of the quantum initial states. This approach borrows the mutual orthogonality property of one-hot vectors; it ultimately yields \(\theta_0=-\frac{\pi}{2}\) and \(\theta_1=\frac{\pi}{2}\), and encoding with these values makes the \(|\psi_1\rangle\) of different tokens as orthogonal as possible. There are, however, differences between QEmbedding and classical embedding. Every element of the weight matrix in classical embedding is a trainable parameter, whereas QEmbedding controls the matrix only through a small number of parameterized unitary gates. It cannot be proved that keeping the \(|\psi_1\rangle\) always orthogonal is more helpful for learning \(|\psi_2\rangle\). Therefore, in this paper we add "\(\theta_0\)" and "\(\theta_1\)" as trainable parameters to the learning process of RQENN, which is why Step III uses parameterized binary indexes. In the Results section we report the performance of the model when using trainable encoding based on the binary index, the orthogonal method, and the parameterized binary index as data inputs, further demonstrating the effectiveness of the proposed methods.
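The effect of the orthogonal-method angles is easy to check numerically. The sketch below (our illustration; we evaluate the cost at fixed angles rather than re-running the paper's gradient descent) computes the sum of absolute pairwise inner products of the four encoded states under random real initial states:

```python
import numpy as np
from itertools import combinations

def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def overlap_cost(theta0, theta1, psi0):
    """Sum of |pairwise inner products| of the four encoded states --
    the quantity the orthogonal method minimizes over theta0, theta1."""
    states = [np.kron(ry(theta1 if b0 else theta0),
                      ry(theta1 if b1 else theta0)) @ psi0
              for b0, b1 in [(0, 0), (0, 1), (1, 0), (1, 1)]]
    return sum(abs(np.vdot(u, v)) for u, v in combinations(states, 2))

rng = np.random.default_rng(0)
for _ in range(3):
    psi0 = rng.normal(size=4)
    psi0 /= np.linalg.norm(psi0)        # random real initial state
    print(f"binary angles (0, 1): {overlap_cost(0.0, 1.0, psi0):.3f}"
          f" | (-pi/2, pi/2): {overlap_cost(-np.pi/2, np.pi/2, psi0):.3f}")
```

In our runs the cost at \((-\pi/2, \pi/2)\) is markedly smaller than at the raw binary angles (and exactly zero for the fixed initial state \(|00\rangle\)), consistent with the angles reported above.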
The trainable encoding method defined in the previous section is a crucial component of our recurrent quantum embedding neural network cell. Much like a classical RNN, we define a cell that is applied successively to the network's input to capture contextual connections in the code. More specifically, the cell comprises a trainable encoding stage and a working stage, which learn the semantics of the input tokens and memorize contextual dependencies, respectively. This cell is applied iteratively in RQENN, and its internal state is passed on to the next iteration of the network. RQENN cells at all time steps share the same trainable parameters.

Figure 2 illustrates the RQENN cell, which learns the quantum word embedding of the current time step input \(\mathbf{x}_t=(x_{t_0},\dots,x_{t_{n-1}})\) in the encoding stage and, in the working stage, combines it with the cell's input hidden state \(|\psi_{t-1}\rangle\) to learn the mapping from this combined state to the cell's output hidden state \(|\psi_t\rangle\). The equation for this process is as follows:

$$|\psi_t\rangle = U_{qnn}U_{qe}U_{in}(\mathbf{x}_t)\,|\psi_{t-1}\rangle$$

(4)

where \(U_{in}\), \(U_{qe}\) and \(U_{qnn}\) denote the unitary matrices of the rotation input layer, the QEmbedding layer and the quantum weight (QWeight) layer, respectively.

Figure 2: Recurrent quantum embedding neural network cell. It consists of a trainable encoding stage and a QNN working stage, where the encoding stage operates as described in the previous section. At each time step the cell transforms the incoming internal state into the outgoing state, and this process is iterated.

The encoding stage uses the trainable encoding method described above. In the rotation input layer, an Ry gate on the \(i\)th qubit rotates by the angle given by the \(i\)th value of the parameterized binary index. In the QEmbedding layer, an \(m\)-layer ansatz composed of alternating rotation layers and entanglement layers learns the quantum word embedding representation; each ansatz layer consists of 2 rotation layers and 2 entanglement layers built from staggered entanglements between adjacent qubits. In the working stage, an \(n\)-layer one-dimensional alternating layered Hardware Efficient Ansatz [48,49] is used to build the QWeight layer. This ansatz is implemented by sequentially applying a two-qubit unitary to adjacent qubits; each two-qubit unitary entangles the last qubit acted on by the previous unitary with the next one. This unique recurrent circuit cell architecture, with scalable layers in multiple stages, is the key to improving model performance. The two-qubit unitary consists of two Ry gates and a CNOT gate, a combination that has been shown to be effective [50]; its unitary transformation is described by Eq. (5). The specific implementations of the different network layers are given by the following equations. Equation (6) is the unitary transformation of the rotation input layer, where \(t\in\{1,\dots,T\}\) is the time step; a token is input into the network at each time step, and \(T\) is set to the total code length. Equations (7), (8) and (9) are the unitary transformations of the QEmbedding layer and the QWeight layer, respectively.

$$U_{l,i}^{[2]}(\boldsymbol{\theta}_{l,i})=\mathrm{CNOT}_{i,i+1}\bigotimes_{j=0}^{1}Ry_{i+j}(\theta_{l,i,j}),\quad l\in\{0,1\}\ \text{and}\ i\in\{0,\dots,n-2\}$$

(5)

$$U_{in}(\mathbf{x}_t)=\bigotimes_{i=0}^{n-1}Ry_i(x_{t_i}),\quad x_{t_i}\in\{\theta_0,\theta_1\}\ \text{and}\ t\in\{1,\dots,T\}$$

(6)

$$U_{qe_l}(\boldsymbol{\theta}_{qe_l})=\prod_{i=1}^{\lfloor (n-1)/2\rfloor}\mathrm{CNOT}_{2i-1,2i}\ \bigotimes_{i=0}^{n-1}Ry_i(\theta_{l,n+i})\ \prod_{i=1}^{\lfloor n/2\rfloor}\mathrm{CNOT}_{2i-2,2i-1}\ \bigotimes_{i=0}^{n-1}Ry_i(\theta_{l,i})$$

(7)

$$U_{qe}(\boldsymbol{\theta}_{qe})=U_{qe_1}(\boldsymbol{\theta}_{qe_1})\,U_{qe_0}(\boldsymbol{\theta}_{qe_0})$$

(8)

$$U_{qnn}(\boldsymbol{\theta}_{qnn})=U_{1,n-2}^{[2]}(\boldsymbol{\theta}_{1,n-2})\dots U_{1,0}^{[2]}(\boldsymbol{\theta}_{1,0})\,U_{0,n-2}^{[2]}(\boldsymbol{\theta}_{0,n-2})\dots U_{0,0}^{[2]}(\boldsymbol{\theta}_{0,0})$$

(9)
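Continuing the PennyLane sketch above (and reusing its qembedding helper), one RQENN cell per Eq. (4) could look as follows; the function name rqenn_cell and the weight shapes are our bookkeeping, not the paper's:

```python
import pennylane as qml

def rqenn_cell(x_t, theta, qe_weights, qnn_weights, n):
    """One RQENN cell, Eq. (4): acts on the circuit's current state (the
    hidden state |psi_{t-1}>) and leaves behind |psi_t>."""
    # Encoding stage: rotation input layer, Eq. (6) ...
    for i, bit in enumerate(x_t):
        qml.RY(theta[bit], wires=i)
    # ... then the QEmbedding layer, Eqs. (7)-(8) (helper defined earlier).
    qembedding(qe_weights, n)
    # Working stage: QWeight layer, Eqs. (5) and (9) -- a ladder of two-qubit
    # unitaries (Ry on qubits i and i+1, then CNOT_{i,i+1}), layer by layer.
    for layer in qnn_weights:               # qnn_weights: (layers, n-1, 2)
        for i in range(n - 1):
            qml.RY(layer[i, 0], wires=i)
            qml.RY(layer[i, 1], wires=i + 1)
            qml.CNOT(wires=[i, i + 1])
```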
We use the RQENN cell to build classifiers for vulnerability detection. As in an RNN, RQENN initializes the hidden state at \(t=0\), here by first applying a layer of Hadamard gates; the RQENN cell is then applied iteratively to the input source code sequence \(\mathbf{x}_1,\mathbf{x}_2,\dots,\mathbf{x}_T\), as shown in Fig. 3, to capture the contextual connections in the source code. The model ends by measuring a single-qubit expectation value on each of the last two qubits. These expectation values are described by Eq. (10):

$$E_i(\mathbf{X},\boldsymbol{\Theta})=\langle 0^{\otimes n}|\,H^{\dagger\otimes n}U_{QC}^{\dagger}(\mathbf{X},\boldsymbol{\Theta})\,\widehat{M}_i\,U_{QC}(\mathbf{X},\boldsymbol{\Theta})\,H^{\otimes n}\,|0^{\otimes n}\rangle,\quad i\in\{n-1,n-2\}$$

(10)

where \(U_{QC}(\mathbf{X},\boldsymbol{\Theta})=U_{cell}(\mathbf{x}_1,\boldsymbol{\Theta})\dots U_{cell}(\mathbf{x}_T,\boldsymbol{\Theta})\) is the quantum circuit composed of all cells, and \(U_{cell}(\mathbf{x}_t,\boldsymbol{\Theta})=U_{qnn}(\boldsymbol{\theta}_{qnn})U_{qe}(\boldsymbol{\theta}_{qe})U_{in}(\mathbf{x}_t,\boldsymbol{\theta}_{in})\). \(\boldsymbol{\Theta}\) is the parameter set of the cell, and \(\mathbf{X}=[\mathbf{x}_1,\dots,\mathbf{x}_T]\) is the input index sequence. \(\widehat{M}_i\) is the operator used to compute the expectation value of the \(i\)th qubit, i.e.

$$\widehat{M}_i=\begin{cases} I\otimes I\otimes\dots\otimes\sigma_z\otimes I, & i=n-2\\ I\otimes I\otimes\dots\otimes I\otimes\sigma_z, & i=n-1 \end{cases}$$

(11)

The two expectation values determine the data category by comparison of their magnitudes, and we use them as logits to compute the cross-entropy loss function for classification.

Figure 3: RQENN classifier. The model is built by iteratively applying the same RQENN cell to the input code token sequence. Measurements are performed separately on the last two qubits to obtain the expectation values used as classification logits.
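Putting the pieces together, here is a sketch of the full classifier, again in PennyLane. The Hadamard initialization, shared cell parameters, and two-qubit \(\langle Z\rangle\) readout follow the text (the hidden state lives on the same \(n\) qubits used for encoding); the softmax form of the cross-entropy and the example size n = 4 are our assumptions:

```python
import pennylane as qml
from pennylane import numpy as np

n = 4                                       # example qubit count
dev = qml.device("default.qubit", wires=n)

@qml.qnode(dev)
def rqenn_classifier(X, theta, qe_weights, qnn_weights):
    """Iterate the shared RQENN cell over the index sequence X and read out
    <Z> on the last two qubits as classification logits, Eqs. (10)-(11)."""
    for i in range(n):
        qml.Hadamard(wires=i)               # initialize hidden state at t=0
    for x_t in X:                           # the same cell at every time step
        rqenn_cell(x_t, theta, qe_weights, qnn_weights, n)
    return qml.expval(qml.PauliZ(n - 2)), qml.expval(qml.PauliZ(n - 1))

def loss(logits, label):
    """Softmax cross-entropy over the two expectation values."""
    z0, z1 = logits
    m = np.maximum(z0, z1)                  # log-sum-exp stabilization
    log_norm = m + np.log(np.exp(z0 - m) + np.exp(z1 - m))
    return log_norm - (z1 if label == 1 else z0)
```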
The goal of our vulnerability detection is to determine whether a program's source code may contain vulnerabilities using the RQENN classifier. In this paper, we perform the vulnerability detection task using the pipeline shown in Fig. 4, which consists of the following three steps:

Figure 4: Vulnerability detection task flow. We extract normalized labeled code gadgets from the source code as training data, generate parameterized binary indexes from them, and feed these into the RQENN classifier. After training, the model can detect the presence of vulnerabilities in source code.

Step I: Generate normalized code gadgets and labels from source code. First, we extract the data dependency graph (DDG) of the code using the open-source code analysis tool Joern. Next, we extract labeled code gadgets based on manually defined vulnerability features. Specifically, we locate the node containing the vulnerable library function/API call in the extracted DDG, such as the "strcat" function shown on the left side of Fig. 4, and slice the code into small pieces according to its connections to that node. API calls are categorized as forward (e.g., the "recv" function) or backward (e.g., the "strcat" function here) according to whether or not they take external input, such as from a socket, and forward and backward slices are generated accordingly. A forward slice collects the statements of the DDG nodes recursively reachable from the API node, and a backward slice collects the statements of the DDG nodes that recursively point to the API node. These slices are the code gadgets, which are labeled '0' or '1' according to whether they contain vulnerabilities. We then normalize the code gadgets: this includes removing comments and strings and normalizing user-defined variable names ('VAR1', etc.) and function names ('FUN1', etc.). The result is the normalized labeled code gadget.

Step II: The normalized labeled code gadgets are treated as text data, from which parameterized binary index sequences are generated. First, we preprocess the data set, cleaning the original text and removing punctuation marks, non-ASCII characters, etc. Then we split and pad the preprocessed text and build a dictionary, which is converted into a parameterized binary index dictionary using the method described earlier. Finally, the tokenized sequences are converted into parameterized binary index sequences according to this dictionary.

Step III: Train and evaluate the RQENN model. We feed the sequences of parameterized binary indexes into the RQENN model in order, execute the quantum circuit on a simulator or a real machine, and complete training and validation of the RQENN model following the quantum circuit learning framework [18]. The trained model can then detect the presence of vulnerabilities in source code.
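A minimal training loop for Step III might look as follows. This is a sketch with synthetic data (the sizes, the random 0/1 index sequences, and the labels are all toy placeholders); the quantum circuit learning framework of ref. 18 is replaced here by PennyLane's stock gradient-descent optimizer, and rqenn_classifier and loss are reused from the sketch above:

```python
import numpy as onp                        # plain NumPy for the toy data
import pennylane as qml
from pennylane import numpy as np

n, T = 4, 4                                # qubits, padded sequence length
rng = onp.random.default_rng(0)
X_train = rng.integers(0, 2, size=(8, T, n))   # toy binary index sequences
y_train = rng.integers(0, 2, size=8)           # toy vulnerability labels

theta = np.array([-np.pi / 2, np.pi / 2], requires_grad=True)
qe_w = np.array(rng.uniform(0, np.pi, (2, 2 * n)), requires_grad=True)
qnn_w = np.array(rng.uniform(0, np.pi, (2, n - 1, 2)), requires_grad=True)

def cost(theta, qe_w, qnn_w):
    """Mean cross-entropy over the toy training set."""
    return sum(loss(rqenn_classifier(X, theta, qe_w, qnn_w), y)
               for X, y in zip(X_train, y_train)) / len(X_train)

opt = qml.GradientDescentOptimizer(stepsize=0.1)
for epoch in range(3):                     # tiny run; simulation is slow
    (theta, qe_w, qnn_w), c = opt.step_and_cost(cost, theta, qe_w, qnn_w)
    print(f"epoch {epoch}: loss {c:.4f}")
```

Note that "\(\theta_0\)" and "\(\theta_1\)" are optimized together with the embedding and QWeight parameters, which is exactly the trainable-encoding idea of Step III in the earlier section.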
<a href=\"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/quantum-computing\/recurrent-quantum-embedding-neural-network-and-its-application-in-vulnerability-detection-scientific-reports-nature-com\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[257742],"tags":[],"class_list":["post-1126006","post","type-post","status-publish","format-standard","hentry","category-quantum-computing"],"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts\/1126006"}],"collection":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/comments?post=1126006"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/posts\/1126006\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/media?parent=1126006"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/categories?post=1126006"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/prometheism-transhumanism-posthumanism\/wp-json\/wp\/v2\/tags?post=1126006"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}