Using Photonic Neurons to Improve Neural Networks – RTInsights

Photonic neural networks represent a promising technology that could revolutionize the way businesses approach machine learning and artificial intelligence systems.

Researchers at Politecnico di Milano earlier this year announced a breakthrough in photonic neural networks. They developed training strategies for photonic neurons similar to those used for conventional neural networks. This means that the photonic brain can learn quickly and accurately and achieve precision comparable to that of a traditional neural network but with considerable energy savings.

Neural networks are a type of technology inspired by the way the human brain works. Developers can use them in machine learning and artificial intelligence systems to mimic human decision making. Neural networks analyze data and adapt their own behavior based on past experiences, which makes them useful for a wide range of applications, but they also require a lot of energy to train and deploy. This makes them costly and inefficient for the typical company to integrate into operations.

See also: MIT Scientists Attempt To Make Neural Networks More Efficient

To overcome this obstacle, the Politecnico di Milano team has been working on developing photonic circuits, which are highly energy-efficient and can be used to build photonic neural networks. These networks use light to perform calculations quickly and efficiently, and their energy consumption grows much more slowly than that of traditional neural networks.

According to the team, the photonic accelerator in the chip allows calculations to be carried out very quickly and efficiently using a programmable grid of silicon interferometers. The calculation time is equal to the transit time of light through a chip a few millimeters in size, which is less than a billionth of a second. The work was presented in a paper published in Science.
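To make the idea more concrete, here is a minimal NumPy sketch (illustrative only, not the team's design) of how a programmable mesh of 2×2 interferometer-like unitaries can apply a matrix-vector transformation to optical amplitudes; the parameterization, mode indices, and settings below are assumptions chosen for demonstration.

```python
import numpy as np

def mzi(theta, phi):
    # One common 2x2 unitary parameterization of a Mach-Zehnder
    # interferometer with two phase shifters (an illustrative choice).
    return np.array([
        [np.exp(1j * phi) * np.cos(theta), -np.sin(theta)],
        [np.exp(1j * phi) * np.sin(theta),  np.cos(theta)],
    ])

def mesh_apply(x, settings):
    # Apply a mesh of MZIs to the input field amplitudes x.
    # `settings` is a list of (mode_index, theta, phi) tuples; the overall
    # effect is a programmable matrix-vector product that, on a real chip,
    # is computed in the time it takes light to cross the device.
    y = x.astype(complex).copy()
    for m, theta, phi in settings:
        y[m:m + 2] = mzi(theta, phi) @ y[m:m + 2]
    return y

x = np.array([1.0, 0.5, 0.25, 0.0])            # input light amplitudes
settings = [(0, 0.3, 0.1), (2, 1.1, 0.4), (1, 0.7, 0.0)]
print(np.abs(mesh_apply(x, settings)) ** 2)    # detected output intensities
```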

See also: Charting a New Course of Neural Networks with Transformers

This breakthrough has important implications for the development of artificial intelligence and quantum applications. The photonic neural network can also be used as a computing unit for multiple applications where high computational efficiency is required, such as graphics accelerators, mathematical coprocessors, data mining, cryptography, and quantum computers.

Photonic neural networks represent a promising technology that could revolutionize the way we approach machine learning and artificial intelligence systems. Their energy efficiency, speed, and accuracy make them a powerful tool for a wide range of applications, with much potential for a variety of industries seeking digital transformation and AI integrations.

Read the rest here:

Using Photonic Neurons to Improve Neural Networks - RTInsights

The Evolution of Artificial Intelligence: From Turing to Neural Networks – Fagen wasanni

AI, or artificial intelligence, has become a buzzword in recent years, but its roots can be traced back to the 20th century. While many credit OpenAI's ChatGPT as the catalyst for AI's popularity in 2022, the concept has been in development for much longer.

The foundational idea of AI can be attributed to Alan Turing, a mathematician famous for his work during World War II. In his paper "Computing Machinery and Intelligence," Turing posed the question, "Can machines think?" He introduced the concept of "The Imitation Game," where a machine attempts to deceive an interrogator into thinking it is human.

However, it was Frank Rosenblatt who made the first significant strides in AI implementation with the creation of the Perceptron in the late 1950s. The Perceptron was a computer modeled after the neural network structure of the human brain. It could teach itself new skills through iterative learning processes.

Despite Rosenblatt's advancements, AI research dwindled due to limited computing power and the simplicity of the Perceptron's neural network. It wasn't until the 1980s that Geoffrey Hinton, along with researchers like Yann LeCun and Yoshua Bengio, reintroduced the concept of neural networks with multiple layers and numerous connections to enable machine learning.

Throughout the 1990s and 2000s, researchers further explored the potential of neural networks. Advances in computing power eventually paved the way for machine learning to take off around 2012. This breakthrough led to the practical application of AI in various fields, such as smart assistants and self-driving cars.

In late 2022, OpenAI's ChatGPT brought AI into the spotlight, showcasing its capabilities to professionals and the general public alike. Since then, AI has continued to evolve, and its future remains uncertain.

To better understand and navigate the world of AI, Lifehacker provides a collection of articles that cover various aspects of living with AI. These articles include tips on identifying when AI is deceiving you, an AI glossary, discussions on fictional AI, and practical uses for AI-powered applications.

As AI continues to shape our world, it is essential to stay informed and prepared for the advancements and challenges it brings.

See original here:

The Evolution of Artificial Intelligence: From Turing to Neural Networks - Fagen wasanni

Types of Neural Networks in Artificial Intelligence – Fagen wasanni

Neural networks are virtual brains for computers that learn by example and make decisions based on patterns. They process large amounts of data to solve complex tasks like image recognition and speech understanding. Each neuron in the network connects to others, forming layers that analyze and transform the data. With continuous learning, neural networks become better at their tasks. From voice assistants to self-driving cars, neural networks power various AI applications and revolutionize technology by mimicking the human brain.

There are different types of neural networks used in artificial intelligence, suited for specific problems and tasks. Feedforward Neural Networks are the simplest type, where data flows in one direction from input to output. They are used for tasks like pattern recognition and classification. Convolutional Neural Networks process visual data like images and videos, utilizing convolutional layers to detect and learn features. They excel in image classification, object detection, and image segmentation.
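As a rough illustration of the feedforward idea described above, the following NumPy sketch passes an input one way through two layers to produce class probabilities; the layer sizes and random weights are placeholders, not a trained model.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def feedforward(x, weights, biases):
    # Data flows one way only: input -> hidden layer(s) -> output.
    for W, b in zip(weights[:-1], biases[:-1]):
        x = relu(W @ x + b)
    logits = weights[-1] @ x + biases[-1]
    logits = logits - logits.max()                 # numerical stability
    return np.exp(logits) / np.exp(logits).sum()   # class probabilities

rng = np.random.default_rng(0)
weights = [rng.normal(size=(16, 8)), rng.normal(size=(3, 16))]  # 8 inputs, 16 hidden, 3 classes
biases = [np.zeros(16), np.zeros(3)]
print(feedforward(rng.normal(size=8), weights, biases))
```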

Recurrent Neural Networks handle sequential data by introducing feedback loops, making them ideal for tasks involving time-series data and language processing. Long Short-Term Memory Networks are a specialized type of RNN that capture long-range dependencies in sequential data. They are beneficial in machine translation and sentiment analysis.

Generative Adversarial Networks consist of two networks competing against each other. The generator generates synthetic data, while the discriminator differentiates between real and fake data. GANs are useful in image and video synthesis, creating realistic images, and generating art.

Autoencoders aim to recreate input data at the output layer, compressing information into a lower-dimensional representation. They are used for tasks like dimensionality reduction and anomaly detection.

Transformer Networks are popular in natural language processing. They use self-attention mechanisms to process sequences of data, capturing word dependencies efficiently. Transformer networks are pivotal in machine translation, language generation, and text summarization.
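For the transformer case, a minimal sketch of the self-attention step is shown below; it omits the learned query, key, and value projections that real transformer layers use, so it is a simplification rather than a faithful implementation.

```python
import numpy as np

def self_attention(X):
    # Scaled dot-product self-attention over a sequence of token vectors X
    # (rows = positions): every position attends to every other, which is
    # how transformers capture dependencies between words.
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                        # pairwise similarities
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ X                                   # each output mixes all positions

tokens = np.random.default_rng(0).normal(size=(5, 8))    # 5 tokens, 8-dim embeddings
print(self_attention(tokens).shape)                      # (5, 8)
```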

These examples represent the diverse range of neural network types. The field of artificial intelligence continuously evolves with new architectures and techniques. Choosing the appropriate network depends on the specific problem and data characteristics.

Continue reading here:

Types of Neural Networks in Artificial Intelligence - Fagen wasanni

The Future of Telecommunications: 3D Printing, Neural Networks … – Fagen wasanni

Exploring the Future of Telecommunications: The Impact of 3D Printing, Neural Networks, and Natural Language Processing

The future of telecommunications is poised to be revolutionized by the advent of three groundbreaking technologies: 3D printing, neural networks, and natural language processing. These technologies are set to redefine the way we communicate, interact, and exchange information, thereby transforming the telecommunications landscape.

3D printing, also known as additive manufacturing, is a technology that creates three-dimensional objects from a digital file. In the telecommunications industry, 3D printing has the potential to drastically reduce the time and cost associated with the production of telecom equipment. For instance, antennas, which are crucial components of telecom infrastructure, can be 3D printed in a fraction of the time and cost it takes to manufacture them traditionally. Moreover, 3D printing allows for the creation of complex shapes and structures that are otherwise difficult to produce, thereby enabling the development of more efficient and effective telecom equipment.

Transitioning to the realm of artificial intelligence, neural networks are computing systems inspired by the human brain's biological neural networks. These systems learn from experience and improve their performance over time, making them ideal for tasks that require pattern recognition and decision-making. In telecommunications, neural networks can be used to optimize network performance, predict network failures, and enhance cybersecurity. For example, a neural network can analyze network traffic patterns to identify potential bottlenecks and suggest solutions to prevent network congestion. Similarly, it can detect unusual network activity that may indicate a cyberattack and take appropriate measures to mitigate the threat.

Lastly, natural language processing (NLP), a subfield of artificial intelligence, involves the interaction between computers and human language. NLP enables computers to understand, interpret, and generate human language, making it possible for us to communicate with computers in a more natural and intuitive way. In telecommunications, NLP can be used to improve customer service, automate routine tasks, and provide personalized experiences. For instance, telecom companies can use NLP to develop chatbots that can understand customer queries, provide relevant information, and even resolve issues without human intervention. Furthermore, NLP can analyze customer feedback to identify common issues and trends, helping telecom companies to better understand their customers and improve their services.

In conclusion, 3D printing, neural networks, and natural language processing are set to revolutionize the telecommunications industry. These technologies offer numerous benefits, including cost reduction, performance optimization, and improved customer service. However, their adoption also presents challenges, such as the need for new skills and the potential for job displacement. Therefore, as we move towards this exciting future, it is crucial for telecom companies, policymakers, and society at large to carefully consider these implications and take appropriate measures to ensure that the benefits of these technologies are realized while minimizing their potential drawbacks. The future of telecommunications is undoubtedly bright, and with the right approach, we can harness the power of these technologies to create a more connected and efficient world.

View post:

The Future of Telecommunications: 3D Printing, Neural Networks ... - Fagen wasanni

Portrait of intense communications within microfluidic neural … – Nature.com

Construction of in vitro neural networks (NN)

The topology for the microfluidic NNs was designed as a dual-compartment architecture separated by microchannels and a middle chamber, as described in Fig.1a and b. The microfluidic design includes large channels (teal area) on both sides of the microfluidic circuit, which are for seeding somas. Physical barriers prevent the somas from migrating outside these large chambers. However, the 5-µm-tall microchannels and a middle chamber (red area) enable neurites to spread and connect the fluidic compartments along defined pathways. Because of the enhanced growth kinetics of the axons, long, straight microchannels (>500 µm in length) are expected to favor them and to prevent dendrites from connecting distant populations.

Figure1c illustrates the possible neurite guidance and connection schemes. From left to right, the first and shortest microchannels should favor neurite outgrowth from the somatic to the synaptic chamber. From there, dendrites are expected to spread over this 3-mm-wide middle chamber, while the axons, in contrast, may grow straight ahead toward the opposite channels or turn back toward the somatic chamber. At one entrance of the long axon microchannels, short dead-end microchannels should prevent an axonal closed loop, which would lock axons into the long microchannel. Those traps should guide the axons toward the short microchannel and the somatic chamber. The last schematic illustrates a simple, non-exhaustive list of examples of connectivity that may result from these guiding rules in the cases of one or two nodes located in a somatic chamber. Active and hidden nodes (blue and gray circles, respectively) can both be involved.

The microfluidic circuits are then assembled with electronic chips on which microelectrode arrays are accurately aligned with the fluidic compartments and microchannels (Fig.2). Thus, several recording devices can efficiently track spike propagation within the neurites while simultaneously monitoring soma activation.

Optical and fluorescent micrographs of random and microfluidic networks showing the homogeneous distribution of somas within the random area of both control (a) and microfluidic (b,c) samples and the wide exploration of neurites within all fluidic compartments, including the somatic chamber (c), the microchannels and the synaptic chamber (d–f). Immunofluorescence staining was performed after 14 days in culture. DAPI, anti-synapsin, and anti-tubulin (YL1/2) were chosen as markers for labeling the cell nuclei, synapses and cytoskeleton, respectively.

For both growth conditions, primary cells extracted from hippocampal neurons were seeded on poly-l-lysine-coated microelectrode arrays and cultured in glial-conditioned media (same culture for both conditions). Thus, the substrate properties and culture conditions remained the same for the two batches of samples (details in Materials and methods). In the somatic chamber, neurons were well dispersed, and neurites homogeneously covered the underlying substrate surface, forming a highly entangled mesh (Fig.2b). Additionally, the synaptic chamber was widely explored by the neurites (Fig.2d), confirming their efficient spreading within the short microchannels as well as the efficient filtering of somas (Fig.2e). Figure2f gives a closer view of the junction with the synaptic chamber. The intricate entanglement of neurites and their proximity within the microchannels is expected to reinforce the neurite coupling efficiency and the network's modularity. These first results confirmed the healthy and efficient outgrowth of neurons in the microfluidic compartments, which succeeded in providing the expected network structure, mainly by keeping the soma and neurite compartments in the desired locations.

Figure3 shows the representative activity recorded within the random and organized networks on Day 6 in vitro (DIV6). As clearly observed, the number of active electrodes and the spike rate are significantly higher in the organized microfluidic NN (Fig.3a and b). Additionally, the number of isolated spikes as opposed to burst events was higher than that in controls (Fig.3c). Thus, the modularity of microfluidic NNs appears enhanced within the microfluidic network (dual-compartment configuration shown in Fig. S1).

Activity patterns of random and organized NNs. Comparison of the neuronal activity of cultured hippocampal neurons cultured in random configuration (left column) and on a microfluidic chip (right column). Recordings were acquired 6 days after seeding (6 days in vitro). (a) Typical 50 s time course of one recording channel of the MEA within the control random sample (left) and inside an axonal microchannel (right). (b) Raster plots of all events crossing the negative threshold of 5 mean absolute deviations for the 64 recording channels of the MEA in the control and microfluidic conditions (left and right resp.). Red dots highlight examples of collective bursts. (c) Evolution of neural activity during the culture time for random (blue) and organized (red) NNs, in terms of the following (from left to right): mean spike rate per active electrode (min 0.1 Hz mean firing rate), number of active electrodes, mean burst rate and burst duration. The mean spike and burst rates are extracted from the voltage traces for each recording channel and averaged among all active electrodes (60 electrodes total, same culture for all conditions). Statistical significance ***p<0.001 (Student's t test).
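For readers who want to see what the event-detection convention in the caption amounts to, here is a minimal sketch assuming a simple threshold at a multiple of the median absolute deviation; the function and parameter names are illustrative and this is not the study's actual analysis pipeline.

```python
import numpy as np

def detect_spikes(trace, fs, n_mads=5.0):
    # Sketch of threshold-based event detection: flag samples where the
    # voltage crosses a negative threshold set at n_mads times the median
    # absolute deviation (MAD) of the trace. Illustrative only.
    mad = np.median(np.abs(trace - np.median(trace)))
    threshold = -n_mads * mad
    crossings = np.flatnonzero((trace[1:] < threshold) & (trace[:-1] >= threshold))
    return (crossings + 1) / fs  # spike times in seconds

# Hypothetical usage, with a voltage trace sampled at 25 kHz:
# spike_times = detect_spikes(voltage_trace, fs=25000)
```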

Note that the electrodes located within the microchannels are expected to have a high sealing resistance because the channel cross section is small and filled with cellular material. As a result, the detection efficiency of such electrodes is believed to be increased compared to that of their synaptic and somatic chamber counterparts44. This effect related only to the measurement condition could artificially increase the activity level observed in microfluidic NNs. However, the spiking rate measured in the synaptic chamber did not follow that trend. While this compartment was similar to the somatic chamber in terms of growth conditions, the spiking rate was significantly higher, being rather comparable to that of the microchannels. Thus, the recording conditions could not explain the higher electrical activity. The electrical activity was enhanced independently of the MEAs detection efficiency, revealing the impact of the NN structure on the cell activity and the discrepancy in the spiking dynamics of the soma and the neurite.

The mid-term evolution of the electrical activity remained the same for both conditions, with all electrophysiological features globally increasing over time up to Day 15 (Fig.3c). Interestingly, the maximal number of active electrodes was reached earlier for the confined microfluidic NN (i.e. 4 days earlier than for the open NN, Fig. S2). Additionally, the number of active electrodes was significantly higher, in agreement with the raster plots (Fig.3b). Thus, more electrodes were active, and their activation occurred earlier in cell development. The confinement and geometrical constraints of the microfluidic environment reinforce the establishment of electrical activity, which agrees with the accelerated maturation of neuronal cells previously observed by immunohistochemistry within a similar microfluidic chip24.

The evolution of the burst rate followed a similar trend, increasing up to Day 14. Values ranged from 2–4 to 3–4 Hz for the microfluidic networks, greatly exceeding the bursting rate of the random NN (10 times higher). The burst duration was, however, similar for control and microfluidic networks, slightly increasing with the culture time (from 50 to 250 ms) and as expected for hippocampal neurons4, confirming the reliability of the microfluidic NNs.

Neurite compartments exhibited dense activity patterns compared to the somatic chamber, with the highest spiking rates being located within the proximal compartments that were the closest to the somatic chamber (Fig.4). Within these short microchannels, spike patterns were characterized by the highest spike amplitude and shape variability. This variability remained within the synaptic chamber, but spike amplitudes were lowered. In those short and synaptic compartments, both dendrites and axons can be expected. However, in the distal and long microchannels, spike amplitude and shape were almost perfectly constant, which is as expected for action potentials carried by axons. These discrepancies were observed under the same growth conditions, all within the microchannels, and stem from the physiological properties of neurites.

Spike forms acquired in each microfluidic compartment. Data are sourced from the same recording at DIV 11, with the 50 s time trace on the left and the superposed cutouts extracted by a spike sorting algorithm (detailed in methods). From top to bottom, the figure shows the typical voltage time trace and spike forms within the long and distant axonal microchannel; the synaptic middle chamber (without somas); the short neurite (dendrites and axons) microchannels; and the somatic chamber.

Interestingly, the activity in the somatic chamber resembled that of the control samples in terms of spike shape and spike rate (Fig.3a). When the activity within the somatic chamber was isolated, the spiking rate closely followed the trend observed in control samples, ranging from 0.9 to 2.5 Hz from 6 to 11 days (Fig. S2), which is a typical value for hippocampal neurons. Thus, the areas containing the soma (within the random and organized NNs, respectively) exhibited comparable spike patterns regardless of the growth condition (open or confined). Previous works reported similar differences between somatic and axonal spikes (without the microfluidic environment)42, which agrees with our observations and further highlights the physiological relevance of the observations. Here, the microchannels provided a unique way to identify and study neurite activity in proximal and distant areas, presumably corresponding to dendrites and axons, respectively.

The cross-correlation (CC) analysis (Fig.5) provided a functional cartography of the random and organized networks at several stages of their development (detailed in materials and methods, and see Fig. S3 for the dual-somatic chamber). For the control sample, correlations became significant at DIV11 between electrode clusters randomly dispersed over the whole sample (Fig.5a). Their amplitude was weak but remained constant over the network. In contrast, cross-correlations were spatially defined and more intense in terms of amplitude and number within the organized networks (Fig.5b), also emerging earlier, at DIV5.

Correlations within random and organized NNs. The cross-correlation matrix (CCM) was extracted from the 60 recording channels of the MEAs during the culture time (one electrode per line and per column; bin size < 5 ms). From top to bottom: CCM obtained at DIV11 and DIV14 for the control sample (left) and at DIV6 and DIV11 for the microfluidic sample (right). (Bottom right) Schematics illustrate the position of the recording channels within the microfluidic compartments. The bottom colored bar is then used in the (xy) axes of the CC maps to highlight the position of each microelectrode: (filled, teal) in the large chamber containing all the soma, (filled, red) in the microchannels and the synaptic chamber, and (open, teal) in the empty large chamber for axon outputs only (no soma).
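A minimal sketch of how such a correlation matrix can be computed from binned spike trains is shown below; the 5 ms bin size follows the caption, while the function and variable names are illustrative rather than the authors' code.

```python
import numpy as np

def cross_correlation_matrix(spike_times, t_stop, bin_size=0.005):
    # Sketch of the zero-lag correlation map described above: each
    # electrode's spike train is binned (5 ms bins here), and the Pearson
    # correlation of every electrode pair fills one cell of the matrix.
    edges = np.arange(0.0, t_stop + bin_size, bin_size)
    binned = np.array([np.histogram(times, bins=edges)[0] for times in spike_times])
    return np.corrcoef(binned)      # shape: (n_electrodes, n_electrodes)

# Hypothetical usage: `spike_times` is a list of 60 arrays of spike times (s),
# one per MEA channel, and `t_stop` is the recording length in seconds.
# ccm = cross_correlation_matrix(spike_times, t_stop=300.0)
```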

Maximal values were found within the long and distal microchannels, with mean correlation coefficients close to 1 and 0.5, respectively. Indeed, strong correlations can be expected when measuring spike propagation within the axonal compartment, which is more highlighted within the distal and long microchannels.

Somatic signals were correlated with some electrodes located in the microchannels and the synaptic chamber, revealing long-range synchrony as well (Fig.5b). Their amplitudes increased with time (Fig.5d), revealing a reinforcement of network synchrony and connectivity, especially between the microchannels and the synaptic and somatic chambers. They were concomitant with a modulation of short-range correlations, which became higher between neighboring electrodes. This effect could have several origins, such as time selection of the master node and a reinforcement of selected connections. Additionally, it could result from inhibitory activity, glutamatergic and GABAergic neurons being expected in similar proportions in our culture, and their maturation could explain the appearance of silent electrodes at the final stage of electrical maturation.

Thus, groups of spatially confined electrodes revealed a synchronization of the subpopulation consistent with the geometrical constraints. Somatic and synaptic chambers and neurite microchannels exhibited specific spiking patterns (Figs.3 and 4) and correlation landscapes (Fig.5) that enabled the identification of each network compartment. In that way, microfluidic circuits are capable of inducing significant differences in the spatiotemporal dynamics of in vitro neural networks.

The short-term cross-correlations between each microelectrode were then assessed to track signal propagation between each compartment (Fig.6). Figure6a first assesses the connectivity of the somatic chamber. The main feature was that there were higher correlation and synchrony levels between soma and neurite than between somas. Most of the correlations occurred with the proximal microchannels. This explains the synchrony and correlation between proximal neurites (Fig.6b, purple column). The analysis also reveals long-range correlations with both the synaptic chamber and the axonal microchannels (orange and yellow columns). Thus, somatic signals efficiently activated the emission of spikes within distant axonal microchannels (up to a few mm).

Immediate correlation of spike trains within the organized NN. Mapping of short-term correlations (signal delay of 2.5 ms max) extracted from the MEA recordings of 11-day-old microfluidic NNs. Arrows represent a significant correlation between the 5-ms-binned spike trains of two electrodes. The maximum delay between correlated electrodes is 2.5 ms. The four panels (a–d) distinguish the interactions between (a) somas and neurites (blue arrow) and (b–d) along neurites. (b) Correlation between the electrodes of the same MEA column but within different microchannels (purple arrows), showing backward and forward propagation between adjacent neurite channels or synchrony between proximal neurites resulting from the same excitation. (c) Correlation between electrodes of the same MEA line, thus within the same or aligned microchannels (green arrows), showing straight spike propagation; (d) Correlation between each electrode located within the microchannels and the synaptic chamber (red arrows), showing entangled neurite-neurite interactions. Straight correlations (green arrows, in Panel (c)) are excluded.

Between different microchannels (Fig.6b, purple arrow), the correlations appeared strongest in the synaptic chamber (n=3.9 per electrode, orange column), where there was no physical barrier to restrict communication between neurites. Then, the correlation within different microchannels (purple, yellow, red columns) could reveal backward and forward propagation between adjacent neurite channels or synchrony resulting from the same excitation. This could stem, respectively, from closed loops of neurites (Fig.1c) or the proximity between microchannels and the somatic or synaptic chambers. The number of these correlations was higher for proximal microchannels, both in terms of number and length of correlation, up to electrodes separated by 5 pitches (n+5). If we consider the neural architecture as we designed it, this would suggest a higher level of connectivity for the dendrites and proximal axons (both present within the short microchannels) than for the distant axon (long microchannel). Further studies should assess this point with immunostaining to identify dendrites and axons and excitatory and inhibitory neurons, for instance. In fact, we must not neglect other possibilities, such as the impact of dendritic signals (e.g. EPSPs and IPSPs from inhibitory and excitatory neurons), which may hide activity within distant microchannels.

Figure6c shows straight propagation along aligned microchannels (green arrow) and presumably along the same or connected neurites. Again, more signals propagated to the left than to the right side of the synaptic chamber, which agrees with the expected position of the dendrites and axons and the filtering effect of the synaptic chamber. These propagations were dominated by short-distance correlations, essentially between neighboring electrodes (n+1 or n+2). Long-range interactions were, however, clearly distinguished between misaligned electrodes (Fig.6d, red arrow), with each active site being correlated on average with three distant (>n+1) electrodes and one neighboring (n+1) electrode. The spatial range of the correlation reached several millimeters (up to n+6). Generally, those panels show that straight propagation involved axonal channels, while propagation between dendrites and within the synaptic chamber was more spatially distributed, which is indeed as expected for hippocampal neurons. The design architecture of the microfluidic NN is functionally relevant.

The directionality of neural communications was then assessed by picturing the delayed cross-correlations (between 5 and 25 ms). Thus, the correlated spike trains were expected to share a similar origin. We assume that a positive delay between correlated electrodes (A and B) indicates the direction of propagation (from A to B), regardless of the propagation pathway (possibly indirect with hidden nodes). Under this assumption, most of the short-range correlations observed previously were suppressed, while long-range correlations remained numerous despite the distance between electrodes and the background noise (Fig.7).

Long-term correlation of spike trains within the organized NN. Mapping of delayed correlations (signal delay of 25 ms max) extracted from the MEA recordings of 11-day-old microfluidic NNs. Arrows represent significant correlation with a delay between −25 ms and 25 ms between 5-ms-binned spike trains of two electrodes. Short-term correlations with a delay of less than 5 ms are excluded. The four panels (a–d) distinguish the interactions between (a) somas and neurites (blue arrow) and (b–d) along neurites. The same representation as in Fig.6 is used for the purple, green and red arrows.
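The directionality reading described above can be sketched as follows; this is an illustrative reimplementation of the idea (a correlation peak at a positive lag read as A-to-B propagation), not the authors' analysis code.

```python
import numpy as np

def propagation_direction(binned_a, binned_b, bin_size=0.005, max_lag_s=0.025):
    # Cross-correlate two 5-ms-binned spike trains over lags up to 25 ms;
    # a peak at a positive lag is interpreted as propagation from A to B,
    # a negative lag as B to A. (np.roll makes the shift circular, which is
    # acceptable when the maximum lag is short relative to the recording.)
    max_lag = int(round(max_lag_s / bin_size))
    a = binned_a - binned_a.mean()
    b = binned_b - binned_b.mean()
    lags = np.arange(-max_lag, max_lag + 1)
    cc = np.array([np.sum(a * np.roll(b, -k)) for k in lags])
    best = lags[np.argmax(cc)]
    if best > 0:
        return "A -> B"
    return "B -> A" if best < 0 else "synchronous"
```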

The temporality of events was clear within aligned microchannels (Fig.7c). Signals propagated from the short to the long microchannels toward the axons and seemed to originate from the somatic chamber (Fig.7a). Additionally, the same somatic electrode seemed to activate several neurite channels, which could explain the correlation observed between those microchannels (Fig.7b). Within adjacent and parallel microchannels (Fig.7b), signals could be carried by the same neurites (in a closed loop configuration), but the delay (5–25 ms) suggests indirect communications, presumably by dendrites. As illustrated in Fig.7d, communications were highly intricate between short and long channels, which confirms efficient neurite mixing within the synaptic chamber. The directionality was also mitigated, as 50% of propagations occurred in both directions for the purple and red columns (short and long microchannels). This dual directionality agrees with the emergence of both input and output nodes in the same somatic chamber (green and blue columns, Fig.7a). For that reason, we can barely distinguish backpropagation events, if any, and their impact on signal processing within such microfluidic circuits.

Interestingly, we observed only one efferent node and few (3–4) afferents (output and input nodes, respectively) for both conditions within organized and random NNs (Fig.7a and Fig. S4, respectively). However, the number of correlated spike trains was significantly reduced in control cultures of the same age, which confirms intense activity underlying the accelerated maturation within the microfluidic environments. The microchannels are shown to enhance the detection efficiency and amplitude of recorded signals. However, high levels of activity and synchrony were also observed in the wider synaptic chamber, which excludes an isolated effect of the enhanced detection efficiency within the microchannels. Differences in encoding properties between random and organized NNs are thus demonstrated, leveraging a high level of connectivity. While somas and neurites could be isolated, this analysis indeed underlines the complexity of neural communications and the rich encoding possibility even within a basic one-node architecture.

View original post here:

Portrait of intense communications within microfluidic neural ... - Nature.com

New Optical Neural Network Filters Info before Processing – RTInsights

The system is similar to how human vision works by discarding irrelevant or redundant information, allowing the ONN to quickly sort out important information.

Cornell University researchers have developed an optical neural network (ONN) that can significantly reduce the size and processing time of image sensors. By filtering out irrelevant information before a camera detects the visual image, the ONN pre-processor can achieve compression ratios of up to 800-to-1, equivalent to compressing a 1,600-pixel input to just four pixels. This is one step closer to replicating the efficiency of human sight.

The ONN works by processing light through a series of matrix-vector multiplications to compress data to the minimum size needed. The system is similar to how human vision works by discarding irrelevant or redundant information, allowing the ONN to quickly sort out important information, yielding a compressed representation of the original data. The ONN also offers potential energy savings over traditional digital systems, which save images and then send them to a digital electronic processor that extracts information.
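As a toy digital stand-in for that optical front end, the sketch below compresses a 1,600-value scene to four numbers with a single matrix-vector multiplication; on the real device the weights are trained and the multiplication happens in the optical domain before the detector, so the figures and random weights here are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the optical pre-processor: one matrix-vector multiplication
# maps a 1,600-pixel scene to a 4-value code read out by four detectors.
scene = rng.random(1600)                              # flattened 40 x 40 input image
encoder = rng.normal(size=(4, 1600)) / np.sqrt(1600)  # untrained placeholder weights
code = encoder @ scene                                # compressed representation
print(code.shape)                                     # (4,)
```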

The researchers tested the optical neural network image sensor with machine-vision benchmarks, used it to classify cell images in flow cytometers, and demonstrated its ability to measure and identify objects in 3D scenes. They also tested reconstructing the original image using the data generated by ONN encoders that were trained only to classify the image. Although not perfect, this was an exciting result, as it suggests that with better training and improved models, the ONN could yield more accurate results.

Their work was presented in a paper titled "Image Sensing with Multilayer, Nonlinear Optical Neural Networks," published in Nature Photonics.

See also: Using Photonic Neurons to Improve Neural Networks

ONNs have potential in situations where low-power sensing or computing is needed, such as in image sensing on satellites, where devices that use very little power are required. In such scenarios, the ability of ONNs to compress spatial information can be combined with the ability of event cameras to compress temporal information, as the latter is only triggered when the input signal changes.

Read this article:

New Optical Neural Network Filters Info before Processing - RTInsights

Tuning and Optimizing Your Neural Network | by Aye Kbra … – DataDrivenInvestor

A guide to tuning and optimizing neural networks.

Table of Contents

1. Understanding the Basics of Neural Networks
2. Importance of Tuning and Optimizing Neural Networks
3. Training, Validation, and Test Sets: An Overview
4. Hyperparameters and Their Role in Neural Networks
5. Tuning Neural Network Hyperparameters
6. Strategies for Efficient Hyperparameter Tuning
7. Regularization Techniques for Avoiding Overfitting
8. Optimizing Neural Networks with Backpropagation
9. Advanced Optimization Techniques
10. Utilizing Hardware for Network Optimization
11. Debugging Neural Networks
12. Staying Current with Neural Network Optimization Trends

The first step in tuning and optimizing your neural network is to understand the basic principles of how neural networks work. In this section, we'll delve into the foundational concepts, including neurons, layers, and weights.

A neuron in a neural network is a mathematical function that collects and classifies information according to a specific architecture. The neuron takes in inputs, multiplies these by their respective weights, and passes them into an activation function to produce an output.

The layers of a neural network consist of an input layer, hidden layers, and an output layer. The input layer receives raw input while the output layer makes final decisions or predictions. Hidden layers fine-tune the input data.

Weights are the crux of the neural network as they adjust during the training process, helping your network learn from the errors it makes.
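Putting those three ideas together, a single neuron can be sketched in a few lines of NumPy; the weights, bias, and sigmoid activation below are illustrative choices, not values from the article.

```python
import numpy as np

def neuron(inputs, weights, bias):
    # One artificial neuron, as described above: multiply each input by its
    # weight, sum with the bias, and pass the result through an activation
    # function (a sigmoid in this sketch).
    z = np.dot(inputs, weights) + bias
    return 1.0 / (1.0 + np.exp(-z))

print(neuron(np.array([0.5, -1.2, 3.0]), np.array([0.8, 0.1, -0.4]), bias=0.2))
```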

Originally posted here:

Tuning and Optimizing Your Neural Network | by Aye Kbra ... - DataDrivenInvestor

Simulation analysis of visual perception model based on pulse … – Nature.com

Neural network dynamics

The channels through which each pulse element receives external stimulus input in PCNN include feedback input channels and connection (linking) input channels. The internal activity item U of the pulse element is modulated by the nonlinear multiplication of the feedback input item F and the connection input item; U stands for the nonlinear modulation matrix. Whether a pulse is issued in PCNN depends on the internal activity item U and the threshold E of the neuron. Each pulse coupling kernel has a size, and the size of the six pulse coupling kernels in layer C1 is 5×5. The function f represents the pixel value of the coupled pulse image. The pulse coupling kernel slides over the input data f(i, j) with a fixed step size u(i), so that the pulse coupling kernel computes the pulse coupling on the local data f(i).

$$ \frac{1}{1 - n}\sum {\frac{f(i,j) - u(i)}{f(m) - f(n)} < n} $$

(1)

$$ 1 - |x| > \frac{1}{1 - n}\ln |x - f(j - 1)| $$

(2)

In the process of sparse decomposition 1-|x|, the high-frequency coefficients of the multi-scale decomposition represent detailed information such as the region boundaries and edges of the multi-source image, and the human visual system is sensitive to such edge details. Constructing a perception strategy for the high-frequency coefficients and extracting the significant ones is therefore important for improving the quality of the perceived image. Combined with the characteristics of the high-frequency component of the source image w(s, t), the image quality evaluation factor p(x, y) is used to construct the perception strategy.

$$ w(s,t) - w(s,0) = w(s - 1,t - 1) $$

(3)

$$ \sum {p(x,y) - p} (x - x^{2} ) < p[n - 1] $$

(4)

In PCNN network, each pixel in the image is equivalent to an impulse element. At this point, the threshold E increases rapidly through the feedback input, causing the pulse element to stop transmitting pulses. The threshold k(x)/k(y) begins to decay over time, and when it is again smaller than the internal active term, the pulse element fires again, and so on.

$$ \sum {k(x)} /k(y) < \log (x - x^{2} - y - 1) $$

(5)
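To ground the description of feedback input, linking input, internal activity, and the dynamic threshold, here is a minimal sketch of one common PCNN formulation; the parameters, coupling kernel, and decay constants are illustrative defaults, not the values used in this paper.

```python
import numpy as np
from scipy.ndimage import convolve

def pcnn(S, steps=10, beta=0.2, aF=0.1, aL=0.3, aE=0.5, VF=0.1, VL=0.2, VE=20.0):
    # Each pixel is a pulse element with a feedback input F, a linking input L,
    # an internal activity U = F * (1 + beta * L), and a dynamic threshold E
    # that jumps when the element fires and then decays over time.
    K = np.array([[0.5, 1.0, 0.5],
                  [1.0, 0.0, 1.0],
                  [0.5, 1.0, 0.5]])            # 3x3 coupling kernel (illustrative)
    F = np.zeros_like(S, dtype=float)
    L = np.zeros_like(S, dtype=float)
    E = np.ones_like(S, dtype=float)
    Y = np.zeros_like(S, dtype=float)
    fire_maps = []
    for _ in range(steps):
        F = np.exp(-aF) * F + VF * convolve(Y, K, mode="constant") + S
        L = np.exp(-aL) * L + VL * convolve(Y, K, mode="constant")
        U = F * (1.0 + beta * L)
        Y = (U > E).astype(float)              # pulse emission
        E = np.exp(-aE) * E + VE * Y           # threshold rises after firing, then decays
        fire_maps.append(Y.copy())
    return fire_maps

# Hypothetical usage on a normalized grayscale image `img` with values in [0, 1]:
# maps = pcnn(img)
```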

The algorithm first performs variance-based enhancement on color images, then uses the pulse-coupled neural network with spatial-adjacency and similar-brightness feature clustering, locates the noise points by comparing the difference between the firing times of different image pixels, and finally follows rules similar to the vector median filtering algorithm. Since each pixel computes its similarity with multiple seed points, the seed point most similar to the pixel, that is, the one at the minimum distance, is taken as its clustering center, and the pixel is labeled with the number of that seed point. Finally, the color values and coordinate values of the seed point and all of its assigned pixels are averaged to obtain the new cluster center, as shown in Fig.1.

Neural network clustering sample fusion.
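The seed-point assignment and cluster-center update described above can be sketched as follows; the feature layout (color plus coordinates) and the function name are assumptions made for illustration, not the paper's implementation.

```python
import numpy as np

def update_clusters(features, seeds):
    # Each pixel is represented by a feature vector of color values plus
    # (x, y) coordinates, is labeled with the number of its most similar
    # (minimum-distance) seed point, and each new cluster center is the
    # average of the members assigned to it.
    dists = np.linalg.norm(features[:, None, :] - seeds[None, :, :], axis=-1)
    labels = dists.argmin(axis=1)
    new_seeds = np.array([
        features[labels == k].mean(axis=0) if np.any(labels == k) else seeds[k]
        for k in range(len(seeds))
    ])
    return labels, new_seeds

# Hypothetical usage with 5-D features (R, G, B, x, y), 1,000 pixels, 8 seeds:
# labels, seeds = update_clusters(np.random.rand(1000, 5), np.random.rand(8, 5))
```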

The registered right and left focus samples were fused. Effective fusion should yield a clear image on both the left and right, that is, restore the contrast and sharpness of the respective blurred areas in the two images. In order to make the result as consistent as possible with the physical standard graph, we choose the correlation coefficient between the perceptual result and the physical standard graph as one of the measurement indexes. In addition, the definition of the average-gradient-balanced image, the scale of the standard-deviation-balanced image and the information degree of the entropy-balanced image are discussed. When the pulse coupling kernel slides over the entire input data, only local data are extracted each time for feature calculation, which reflects the local connectivity of PCNN and greatly speeds up the calculation. In the sliding process, the parameters of each pulse coupling kernel remain unchanged, which means that each pulse coupling kernel only observes the features it is tuned to through its own parameters; this greatly reduces the number of parameters and reflects the parameter-sharing property of PCNN.

Based on the chaotic sequence and cyclic/block diagonal splitting structure of homomorphic filtering, and aiming at the problems of poor reconstruction performance and high computational complexity, this paper proposes a deterministic measurement matrix optimization strategy based on modified gradient descent to minimize the correlation between the observation matrix and the projection matrix. If the segmentation criterion is satisfied, the point (x, y) belongs to the foreground; otherwise it belongs to the background. Compared with single-threshold segmentation miu(r, g, b), double-threshold segmentation can effectively reduce misjudgment.

$$ miu(r,g,b) = \sqrt {(miu.\exp (r,g) - miu.\log (r,b)) - 1} $$

(6)

$$ \log (i + j) - \log (i - j) - 1 < i - j $$

(7)

Since the point cloud data log(i+j) has no explicit connectivity, the bilateral filtering algorithm cannot be directly applied to point cloud surface denoising. The bilateral filtering algorithm mainly involves point V. In this paper, the method is used to calculate the neighboring points of a discrete point V, and the normal of the vertex is obtained by optimizing a quadratic energy term over the neighboring points. The essence of visual perception is that the image is divided into several regions according to some similarity principles, so the quality of segmented images can be judged by the uniformity within each region. Therefore, the optimal segmentation result can be identified by calculating the 1/(1−i) value of the binary image, so as to realize the automatic selection of the optimal segmentation result exp(1/d).

$$ \frac{1 - i}{i}Z(i - j - k) = \frac{1}{1 - i} + \frac{1}{1 - j} + \frac{1}{1 - k} + 1 $$

(8)

$$ \exp ( - \frac{miu(x + y - 1)}{2d})/\exp ( - \frac{x + y}{d}) < 1 $$

(9)

Coupling connection miu(x+y-1)/d refers to the operation mechanism of PCNN when the connection strength coefficient is not equal to 0. In this case, the element not only receives external excitation, but also receives feedback input information of the neighborhood pulse element. In this case, each pulse element in the model is coupled to each other. In the case of coupling connection, using coupling connection input L to regulate feedback input F is the key to communication between pulse elements in the coupled PCNN model.

$$ \sum {|x + p(x - 1)|} \sum {|x - p(x - 1)|} \in w(x,t) $$

(10)

In the clipping method, the boundary p(x-1) of one grid is used to cut another grid in the overlapping area w(x, t), and then a new triangle is generated on the common boundary to make the two grids join together. This method will produce a large number of small triangles at the common boundary due to clipping. Moreover, this method only uses the vertices in one mesh in the overlapping region, and the vertices in the other mesh are completely abandoned. For the mesh with large overlapping region, the overlapping region of the two grids cannot be used to correct the vertices. At the same time, due to the error in the registration process of multi-slice grids, the boundary of one grid needs to be projected to another grid before clipping in Fig.2.

Homomorphic filtering results of visual images.

Since the image fusion rules determine the final perception result, it is better to choose the appropriate fusion compliance rules that are more in line with the perception expectation to design the image perception experiment. We know that the image after pyramid decomposition will get the low frequency subgraph of near similar information of feature image and the high frequency subgraph of detail feature of feature image. Therefore, designing different perception rules for different features can better achieve high-quality image perception. For the same experimental image, if the entropy of the segmentation image obtained by a certain method is relatively large, it indicates that the performance of the segmentation method is better. In general, the segmentation effect of the proposed method is better than other segmentation methods. Whether it is objective evaluation criteria or direct observation of segmentation effect, it can be noted that the protection of color edge details in the center area is better than other methods.

Pulse coupling feed input is the main input source received by pulse elements, and neighboring pulse elements can influence the feed input signal of pulse elements through link mode. The external stimulus is received by the feed input domain and then coupled with the adjacent pulse element pulse signal received by the link input domain and sent to the internal activity item. The value of the internal activity term gradually increases with the cycle, while the dynamic threshold gradually decreases with the cycle t(i, j), and the value of the internal activity term is compared with the dynamic threshold for each cycle s(i ,j).

$$ A + B*t(i,j) + C*s(i,j) < 1 $$

(11)

$$ 10\log (2.5^{x} - 2x - 1)^{2} < 1/\log (2^{x} - x) $$

(12)

In contrast to log(2^x − x), the LSCN (Long and Short Sequence Concerned Network), a simplified and improved version of the PCNN model, streamlines the input signal acquisition mechanism, and the total number of undetermined parameters is greatly reduced. There are three leakage integrators in the traditional PCNN model, which require two pulse coupling operations. The LSCN model also has three leakage integrators, but only one pulse coupling operation is required. This means that the time complexity of the LSCN model is lower than that of the traditional model, and the relationship between internal activity items and external stimuli in this model is more direct. Moreover, unlike traditional PCNN, the iteration process h(i, j)/x of the LSCN model stops automatically rather than being set manually, which is more convenient over multiple iterations.

$$ \sqrt {\Delta h_{x} (i,j)/x + \Delta h_{y} (i,j)/y + \Delta h_{z} (i,j)/z} = 1 $$

(13)

$$ 1 - \ln \sum {|p(x) - p(x - 1)|} - \ln p(x) \in p(1 - x) $$

(14)

In the process of perception at this level, p(x) − p(x − 1), an independent preliminary judgment is made on each image and the relevant conclusions are recorded, and then each judgment and conclusion is perceived jointly, so as to form the final joint judgment. The amount of data processed by the decision-level perception method is the least among the three levels, and it has good fault tolerance and real-time performance, but it requires more pre-processed data.

$$ X(a,b,c) = R(a,b)/c + G(c,b)/a + B(a,c)/b $$

(15)

Firstly, feature extraction X(a, b, c) is carried out on the original image, and then these features are perceived. Because the object perceived at this level is not the image itself but its features, the amount of data to be processed is compressed to a certain extent, which improves efficiency and is conducive to real-time processing. The candidate regions, classification probabilities, and extracted features generated by the PCNN network are then used to train the cascade classifier. The training set at the initial time contains all positive samples and the same number of randomly sampled negative samples. The RealBoost classifier is then used for pedestrian classification.

The audience dataset labels age and gender information together, suggesting that the model is actually a multi-task model, but it does not explore the intrinsic relationship between the two tasks to obtain better detection results. The model in Fig.3 had a gender identification accuracy of 66.8 percent on the audience dataset. However, the completely abandoned saliency maps actually contain some important saliency information, and discarding them makes the saliency detection of the PCNN model inaccurate. Therefore, it is necessary to reasonably perceive the salient information at each scale based on the salient information at the minimum-entropy scale. Accordingly, taking the saliency information at the minimum-entropy scale as the basis, this paper uses the reciprocal of the corresponding entropy at other scales as the contribution rate to perceive the saliency information at those scales, and thus proposes a multi-scale final saliency map determination method.

Information annotation of pulse coupling data set.

The visual boundary coefficient is more suitable for describing the difference between the visual boundary and the visual frame, and image enhancement is convenient for processing visual boundary detection. Based on the diffusion principle of nonlinear partial differential equation, the model can control the diffusion direction by introducing appropriate diffusion flux function, and can also be combined with other visual boundary detection methods. In order to verify that the superpixel-based unsupervised FCM color visual perception method proposed in this chapter can obtain the best segmentation effect, 50 images were selected from BSDS500 as experimental samples. Since the method proposed in this chapter can automatically obtain the cluster number C value, while the traditional clustering algorithm uses a fixed C value for each image, the fixed value of C and the method of automatically obtaining the cluster number C value will be used for the experiment respectively. The algorithm requires three essential parameters, namely, the weighting index, the minimum error threshold and the maximum number of iterations, which are respectively 2, 15 and 50 in this experiment, and the adjacent window size is set to 3*3.

As can be seen in Fig.4, although the perceptual image obtained by the maximum value method is optimal in terms of optical brightness, its edges show an obvious "sawtooth" effect and are more blurred. Compared with the source image, the perceptual image obtained by the discrete wavelet transform method has obvious shortcomings in saturation and brightness. From the perspective of visual effect, the perceptual image obtained by the visual perception transformation method shows an obvious edge oscillation effect. In contrast, the proposed image perception algorithm based on compressed sensing theory achieves good visual effects in terms of clarity, contrast and detail representation. The visual boundary detection method based on the visual boundary coefficient has certain shortcomings in practical application: if the visual boundary neighborhood between frames changes irregularly, the visual boundary coefficient decreases, and visual dithering in video clips can also make the visual boundary coefficient increase; both effects can reduce the detection performance of the algorithm.

Image enhancement perception distribution.

If the minimum value of the interval in which the previous frame is located is equal to the minimum value of the minimum value of all subintervals in the search window, a further comparison is made in the subinterval in which the current frame is located. Since the search window of the current frame does not necessarily coincide exactly with the subinterval, the minimum value of the subinterval of the current frame boundary needs to be recalculated when determining the minimum value of the different subintervals (even without recalculation, the impact is limited).

Without the visual perception shared pulse coupling layer, P-Net's face detection and pedestrian detection would each need to extract features from 224×224-pixel images, doubling the time spent training these two tasks, and R-Net, with its 448×448-pixel input, would take even more time. At the same time, face detection and pedestrian detection are intrinsically connected: most detected faces can be located within a pedestrian detection box, so jointly training face detection and pedestrian detection can improve the accuracy of both. Obviously, it is simple and fast to segment PMA (Plane Moving Average) sequences according to zero points, but many long motion patterns will be generated. A long motion mode is not conducive to key frame extraction, because it is difficult to express visual content with a long motion mode. Secondly, a long motion mode expressed by the triangular model will have a large error and is not accurate. In this case, we can separate the long motion mode into multiple motion modes. The method of separation is to determine the minimum point in the long motion pattern.

It can be seen that visual boundary detection using the visual boundary coefficient and the standard histogram intersection method each have their own advantages and disadvantages, and their overall performance is comparable. For the data set in Fig.5, the fixed min value detection method using visual boundary coefficients shows different properties. In the face of common noise attacks, the improved PCNN model achieves a higher Area Under Curve (AUC) value, which indicates that the improved model is more robust. If the cost of false visual boundary detection is equal to that of missed visual boundary detection, the visual boundary detection method using the visual boundary coefficient is slightly inferior to the standard histogram intersection method on movie and video data sets. However, on the video dataset, the visual boundary detection method using visual boundary coefficients is slightly better than the standard histogram intersection method. If the costs of false and missed visual boundaries are not equal, the opposite is true. In general, the method using symmetric weighted window frame difference and moving average window frame difference is more stable and reliable than the method using 1/2-symmetric weighted window frame difference and 1/2-moving average window frame difference.

Parameter adjustment of boundary coefficient of visual perception.

See more here:

Simulation analysis of visual perception model based on pulse ... - Nature.com

Is running AI on CPUs making a comeback? – TechHQ

If somebody told you that a refurbished laptop could eclipse the performance of an NVIDIA A100 GPU when training a 200 million-parameter neural network, you'd want to know the secret. Running AI routines on CPUs is supposed to be slow, which is why GPUs are in high demand and NVIDIA shareholders are celebrating. But maybe it's not that simple.

Part of the issue is that the development and availability of GPUs, which can massively parallelize matrix multiplications, have made it possible to brute-force progress in AI. Bigger is better when it comes to both the amount of data used to train neural networks and the size of the models, reflected in the number of parameters.

Considering state-of-the-art large language models (LLMs) such as OpenAI's GPT-4, the number of parameters is now measured in the billions. And training what is, in effect, a vast, multi-layered equation by first specifying model weights at random and then refining those parameters through backpropagation and gradient descent is now firmly GPU territory.

Nobody runs high-performance AI routines on CPUs, or at least that's the majority view. The growth in model size, driven by gains in accuracy, has led users to overwhelmingly favor much faster GPUs to carry out billions of calculations back and forth.

But the scale of the latest generative AI models is putting this brute-force GPU approach to the test. And many developers no longer have the time, money, or computing resources to compete in fine-tuning the billions of artificial neurons that make up these many-layered networks.

Experts in the field are asking if there's another, more efficient way of training neural networks to perform tasks such as image recognition, product recommendation, and natural language processing (NLP) search.

Artificial neural networks are compared to the workings of the human brain. But the comparison is a loose one as the human brain operates using the power of a dim light bulb, whereas state-of-the-art AI models require vast amounts of power, have worryingly large carbon footprints, and require large amounts of cooling.

That being said, the human brain consumes a considerable amount of energy compared with other organs in the body. But its GPU-beating efficiency, which is orders of magnitude better, stems from the fact that the brain's chemistry recruits only the neurons it needs rather than performing calculations in bulk.

AI developers are trying to mimic those brain-like efficiencies in computing hardware by engineering architectures known as spiking neural networks. Neurons behave more like accumulators and fire only when repeatedly prompted. But it's a work in progress.

However, it's long been known that training AI algorithms could be made much more efficient. Matrix multiplications assume dense computations, but researchers showed a decade ago that picking just the top ten percent of neuron activations will still produce high-quality results.

The issue is that to identify the top ten percent you would still have to run all of those sums in bulk, which would remain wasteful. But what if you could look up a list of those most active neurons based on a given input?

And it's the answer to this question that opens up the path to running AI on CPUs, which is potentially game-changing, as the observation that a refurbished laptop can eclipse the performance of an NVIDIA A100 GPU hints at.

So what is this magic? At the heart of the approach is the use of hash tables, which famously run in constant time (or thereabouts). In other words, searching for an entry in a hash table is independent of the number of locations. And Google puts this principle to work on its web search.

For example, if you type Best restaurants in London into Google Chrome, that query, thanks to hashing (which turns the input into a unique fingerprint), provides the index to a list of topical websites that Google has filed away at that location. And it's why, despite having billions of websites stored in its vast index, Google can deliver search results to users in a matter of milliseconds.

And, just as your search query in effect provides a lookup address for Google, a similar approach can be used to identify which artificial neurons are most strongly associated with a piece of training data, such as a picture of a cat.

In neural networks, hash tables can be used to tell the algorithm which activations need to be calculated, dramatically reducing the computational burden to a fraction of brute force methods, which makes it possible to run AI on CPUs.

In fact, the class of hash functions that turn out to be most useful are dubbed locality-sensitive hash (LSH) functions. Regular hash functions are great for fast memory addressing and duplicate detection, whereas locality-sensitive hash functions provide near-duplicate detection.

LSH functions can be used to hash data points that are near to each other (in other words, similar) into the same buckets with high probability. And this, in terms of deep learning, dramatically improves the sampling performance during model training.
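
To make the idea concrete, here is a minimal, illustrative sketch of the principle (not ThirdAI's or any production implementation): a random-projection LSH table built over a layer's weight vectors, used to pre-select which neuron activations to compute for a given input. All sizes and values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer: 10,000 neurons, each with a 256-dimensional weight vector.
n_neurons, dim, n_bits = 10_000, 256, 8
weights = rng.standard_normal((n_neurons, dim))

# Random-projection (SimHash-style) LSH: one signed projection per hash bit.
projections = rng.standard_normal((n_bits, dim))

def lsh_bucket(vector):
    """Map a vector to a bucket id from the signs of its random projections."""
    bits = (projections @ vector) > 0
    return int(np.packbits(bits)[0])

# Build the hash table once: bucket id -> indices of neurons whose weights land there.
table = {}
for idx, w in enumerate(weights):
    table.setdefault(lsh_bucket(w), []).append(idx)

# At training or inference time, hash the input and compute only the activations
# of neurons in the matching bucket instead of all 10,000 dot products.
x = rng.standard_normal(dim)
candidates = table.get(lsh_bucket(x), [])
activations = weights[candidates] @ x
print(f"computed {len(candidates)} of {n_neurons} activations")
```

In practice, schemes of this kind use several hash tables and rebuild them periodically as the weights change during training; a single table is shown here only to keep the idea visible.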

Hash functions can also be used to improve the user experience once models have been trained. And computer scientists based in the US at Rice University, Texas, Stanford University, California, and from the Pocket LLM pioneer ThirdAI, have proposed a method dubbed HALOS: Hashing Large Output Space for Cheap Inference, which speeds up the process without compromising model performance.

As the team explains, HALOS reduces inference to sub-linear computation by selectively activating only a small set of likely-to-be-relevant output layer neurons. "Given a query vector, the computation can be focused on a tiny subset of the large database," write the authors in their conference paper. "Our extensive evaluations show that HALOS matches or even outperforms the accuracy of given models with 21x speed up and 87% energy reduction."

Commercially, this approach is helping merchants such as Wayfair, an online retailer that enables customers to find millions of products for their homes. Over the years, the firm has worked hard to improve its recommendation engine, noting a study by Amazon showing that even a 100-millisecond delay in serving results can put a noticeable dent in sales.

And, sticking briefly with online shopping habits, more recent findings published by Akamai report that over half of mobile website visitors will leave a page that takes more than three seconds to load: food for thought, as half of consumers are said to browse for products and services on their smartphones.

All of this puts pressure on claims that clever use of hash functions can enable AI to run on CPUs. But the approach more than lived up to expectations, as Wayfair has confirmed in a blog post. "We were able to train our version three classifier model on commodity CPUs, while at the same time achieving a markedly lower latency rate," commented Weiyi Sun, Associate Director of Machine Learning at the company.

Plus, as the computer scientists described in their study, the use of hash-based processing algorithms accelerated inference too.

Here is the original post:

Is running AI on CPUs making a comeback? - TechHQ

AI’s Transformative Impact on Industries – Fagen wasanni

Artificial intelligence (AI) has made remarkable progress in recent years, revolutionizing various industries and capturing the imagination of experts worldwide. Several notable research projects have emerged, showcasing the immense potential of AI and its transformative impact on different sectors.

One prominent project is DeepMind's AlphaFold, an AI system that accurately predicts protein folding structures using deep learning algorithms. This breakthrough has the potential to revolutionize bioinformatics and accelerate drug discovery processes by enabling a better understanding of protein structures and their functions.

In the healthcare industry, IBM Watson's cognitive computing capabilities have paved the way for personalized medicine and improved diagnostics. Watson can analyze vast amounts of patient data, medical research, and clinical guidelines to provide evidence-based treatment recommendations. Its application in oncology has shown promising results, aiding doctors in making informed decisions and improving patient outcomes.

Another notable project is Google Brain, an AI system introduced in 2011. Google Brain focuses on open learning and aims to emulate the functioning of the human brain as closely as possible. It has achieved significant success in simulating human-like communication between AI entities, demonstrating the learning capabilities and adaptability of AI systems.

Google Brain's Transformer, a neural network architecture, has revolutionized natural language processing and machine translation. Its attention mechanism allows the model to focus on relevant parts of the input sequence, overcoming the limitations of traditional neural networks. The Transformer has significantly improved translation quality and found success in various NLP tasks and computer vision tasks.

Lastly, Google DeepMind's AlphaGo is a milestone in AI, beating world champions in the game of Go and pushing the boundaries of AI in strategic board games. The development of AlphaGo Zero, which relies solely on reinforcement learning, marked a true breakthrough in AI mastery.

These projects demonstrate the transformative impact of AI on various industries, from healthcare to language processing to strategic games. AI continues to push boundaries and open new possibilities in different fields, promising a future of boundless possibilities.

Read the original post:

AI's Transformative Impact on Industries - Fagen wasanni

ASU researchers bridge security and AI – Full Circle

Fast-paced advancements in the field of artificial intelligence, or AI, are proving the technology is an indispensable asset. In the national security field, experts are charting a course for AI's impact on our collective defense strategy.

Paulo Shakarian is at the forefront of this critical work using his expertise in symbolic AI and neuro-symbolic systems, which are advanced forms of AI technology, to meet the sophisticated needs of national security organizations.

Shakarian, an associate professor of computer science in the School of Computing and Augmented Intelligence, part of the Ira A. Fulton Schools of Engineering at Arizona State University, has been invited to attend AI Forward, a series of workshops hosted by the U.S. Defense Advanced Research Projects Agency, or DARPA.

The event includes two workshops: a virtual meeting that took place earlier this summer and an in-person event in Boston from July 31 to Aug. 2.

Shakarian is among 100 attendees working to advance DARPA's initiative to explore new directions for AI research impacting a wide range of defense-related tasks, including autonomous systems, intelligence platforms, military planning, big data analysis and computer vision.

At the Boston workshop, Shakarian will be joined by Nakul Gopalan, an assistant professor of computer science, who was also selected to attend the event to explore how his research in human-robot communication might help achieve DARPAs goals.

In addition to his involvement in AI Forward, Shakarian is preparing to release a new book in September 2023. The book, titled Neuro-symbolic Reasoning and Learning, will explore the past five years of research in neuro-symbolic AI and help readers understand recent advances in the field.

As Shakarian and Gopalan prepared for workshops, they took a moment to share their research expertise and thoughts on the current landscape of AI.

Explain your research areas. What topics do you focus on?

Paulo Shakarian: My primary focus is symbolic AI and neuro-symbolic systems. To understand them, it's important to talk about what AI looks like today, primarily as deep learning neural networks, which have been a wonderful revolution in technology over the last decade. Looking at problems specifically relevant to the U.S. Department of Defense, or DoD, these AI technologies were not performing well. There are several challenges, including black box models and their explainability, systems not being inherently modular because they're trained end-to-end, and the enforcement of constraints to help avoid collisions and interference when multiple aircraft share the same airspace. With neural networks, there's no inherent way in the system to enforce constraints. Symbolic AI has been around longer than neural networks, but it is not data-driven, while neural networks are and can learn symbols and repeat them back. Traditionally, symbolic AI's abilities have not been demonstrated anywhere near the learning capacity of a neural network, but all the issues I've mentioned are shortcomings of deep learning that symbolic AI can address. When you start to get into these use cases that have significant safety requirements, like in defense, aerospace and autonomous driving, there is a desire to leverage a lot of data while accounting for safety constraints, modularity and explainability. The study of neuro-symbolic AI uses a lot of data with those other parameters in mind.

Nakul Gopalan: I focus on the area of language grounding, planning and learning from human users for robotic applications. I attempt to use demonstrations that humans provide to teach AI systems symbolic ideas, like colors, shapes, objects and verbs, and then map language to these symbolic concepts. In that regard, I also develop neuro-symbolic approaches to teaching AI systems. Additionally, I work in the field of robot learning, which involves implementing learning policies to help robots discover how to solve specific tasks. Tasks can range from inserting and fastening bolts in airplane wings to understanding how to model an object like a microwave so a robot can heat food. Developing tools in these large problem areas in machine learning and artificial intelligence can enable robots to solve problems with human users.

Tell me about your research labs. What research are you currently working on?

PS: The main project I've been working on in my lab, Lab V2, is a software package we call PyReason. One of the practical results of the neural network revolution has been really great software like PyTorch and TensorFlow, which streamline a lot of the work of making neural networks. Google and Meta put considerable effort into these pieces of software and made them free to everyone. We've noticed in neuro-symbolic literature that everyone is reinventing the wheel, in a sense, by creating a new subset of logic for their particular purposes. Much of this work already has copious amounts of literature previously written on it. In creating PyReason, my collaborators and I wanted to create the best possible logic platform for working with machine learning systems. We have about three or four active grants with it, and people have been downloading it, so it has been our primary work. We wanted to create a very strong piece of software to enable this research, so you don't have to keep reimplementing old bits of logic. This way it's all there, it's mature and relatively bug-free.

NG: My lab, the Logos Robotics Lab, focuses on teaching robots a human approach to learning and solving tasks. We also work on representations for task solving to understand how robots can model objects so they can solve the tasks we need robots to solve, like learning how to operate a microwave, for example, and understanding how to open its door and put an object inside. We use machine learning techniques to discover robot behavior and focus on teaching robots tasks from human users with sample-efficient machine learning methods. Our team learns about object representations, such as modeling microwaves, toasters and pliers, to understand how robots can use them. One concept we work on is tactile sensing, which helps to recognize objects and use them for solving tasks by touch. We do all this with a focus on integrating these approaches with human coworker use cases so we can demonstrate the utility of these learning systems in the presence of a person working alongside the robot. Our work touches practical problems in manufacturing and socially relevant problems, such as introducing robots into domains like assisted living and nursing.

What initially drew you to engineering and drove you to pursue work in this field?

PS: I had an interesting journey to get to this point. Right out of high school, I went to the United States Military Academy at West Point, graduated, became a military officer and was in the U.S. Army's 1st Armored Division. I had two combat tours in Iraq, and after my second combat tour, my unit sent me on a three-month temporary assignment to DARPA as an advisor because I had combat experience and a technical degree, a bachelor's degree in computer science. At DARPA, I learned how some of our nation's top scientists were applying AI to solve relevant defense problems and became very interested in both intelligence and autonomy. Being trained in military intelligence, I've worked in infantry and armor units to understand how intelligence assets were supporting the fight, and I saw that the work being done at DARPA was light-years beyond what I was doing manually. After that, I applied to a special program to go back to graduate school and earned my doctoral degree, focusing on AI. As part of that program, I also taught for a few years at West Point. After completing my military service, I joined the faculty at ASU in 2014.

NG: I have been curious about learning systems related to control and robotic applications since my undergraduate degree studies. I was impressed by the capability of these systems to adapt to a human users needs. As for what drew me to engineering, I was always fascinated by math and even competed in a few math competitions in high school. A career in engineering was a way for me to pursue this interest in mathematics for practical applications. A common reason for working in computer science research is its similarity to the mathematics field. The computer science field can solve open-ended theoretical problems while producing practical applications of this theoretical research. Our work in the School of Computing and Augmented Intelligence embodies these ideals.

There's so much hysteria and noise in the media about AI. Speaking as professional researchers in this field, are we near any truly useful applications that are going to be game changers for life in various industries?

PS: Yes, I think so. We've already seen what convolutional neural networks did for image recognition and how that has been embedded in everything from phones to security cameras and more. We're going to see a very similar phenomenon going on with large language models. The models have problems, and the main one is a concept called hallucinations, which means the models give the wrong answers or information. We also can't have any strong safety guarantees with large language models if you can't explain where the results came from, which is the same problem with every other neural model. Companies like Google and OpenAI are doing a lot of testing to mitigate these potential issues that could come out, but there's no way they could test every possible case. With that said, I expect to see things like the context window, or the amount of data you can put in a prompt, expand with large language models in the next year. That's going to help improve both the training and use of these models. There have been a lot of techniques introduced in the past year that will significantly improve the accuracy in everyday use cases, and I think the public will see a very low error rate. Large language models are crucial in generating computer code, and that's likely to be the most game-changing, impactful result. If we can write code faster, we can inherently innovate faster. Large language models are going to help researchers continue to act as engines of innovation, particularly here in the U.S. where these tools are readily available.

NG: Progress in machine learning has been meteoric. We have seen the rise of generative models for language, images, videos and music in the last few years. There are already economic consequences of these models, which we're seeing in industries such as journalism, writing, software engineering, graphic design, law and finance. We may one day see fewer of these kinds of jobs as our efficiency in pursuing this advancement increases, but there are still questions about the accuracy and morality of using such technology and its lasting social and economic impacts. There is some nascent understanding of the physical world in these systems, but they are still far from being efficient when collaborating with human users. I think this technology will change the way we function in society, just as introducing computers changed the type of jobs people aspire toward, but researchers are still focused on developing the goal of artificial general intelligence, which is AI that understands the physical world and functions independently in it. We are still far from such a system, although we have developed impressive tools along the way.

Do you think AI's applications in national security will ever get to a point where the public sees this technology in use, such as the autonomous vehicles being tested on roads in and around Phoenix, or do you think it will stay behind the scenes?

PS: When I ran my startup company, I learned that it was important for AI to be embedded in a solution that everyone understands on a daily basis. Even with autonomous vehicles, the only difference is that there's no driver in the driver's seat. The goal is to get these vehicles to behave like normal cars. But the big exception to all of this is ChatGPT, which has turned the world on its head. Even with these technologies, I have a little bit of doubt that our current interface will be the way we interact with these types of AI going forward, and the people at OpenAI agree.

I see further development in the future to better integrate technology like ChatGPT into a normal workflow. We all have tools we use to get work done, and there are always small costs associated. With ChatGPT, there's the cost of flipping to a new window, logging into the program and waiting for it to respond. If you're using it to craft an email that's only a few sentences long, it may not feel worth it, and then you don't think of this as a tool to make an impact as often as you should. If ChatGPT were more integrated into processes, I think use of it would be different. It's such a compelling technology, and I think that's why they were able to release it in this very simple, external chat format.

NG: We use a significant amount of technology developed for national security for public purposes, in applications from the internet to GPS devices. As technology becomes more accessible, it continues to be declassified and used in public settings. I expect the same will happen for most such research products developed by DARPA.

Link:

ASU researchers bridge security and AI - Full Circle

Spatial attention-based residual network for human burn … – Nature.com

Accurate diagnosis of human burns requires a sensitive model. ML and DL are commonly employed in medical imaging for disease diagnosis. ResNeXt, AlexNet, and VGG16 are state-of-the-art deep-learning models frequently utilized for medical image diagnosis. In this study, we evaluated and compared the performance of these models for diagnosing burn images. However, these models showed limited effectiveness in accurate diagnosis of burn degree and distinguishing grafts from non-grafts.

ResNeXt, a deep residual model, consists of 50 layers, while AlexNet and VGG16 are sequential models with eight and 16 layers, respectively. These layers extract features from the burned images during the model's training process. Unfortunately, distinguishing between deep dermal and full-thickness burns can be challenging, as they share similar white, dark red, and brown colors. Consequently, highly delicate and stringent methods are required for accurate differentiation. AlexNet and VGG16, being sequential models, mainly extract low-level features, whereas ResNeXt excels in extracting high-dimensional features. A limitation is that these models can only learn positive weight features due to the ReLU activation function. This constraint may hinder their ability to precisely identify critical burn characteristics. The DL models AlexNet, ResNeXt, VGG16, and InceptionV3 are widely used for medical image diagnosis; however, these models encounter challenges in accurately categorizing burn degrees and differentiating grafts from non-grafts. Finding effective ways to handle these challenges and improve feature extraction could lead to more sensitive and reliable burn diagnosis models.

The ResNeXt model [33] influenced the BuRnGANeXt50 model. To construct the BuRnGANeXt50 model, the original ResNeXt model's topology is modified. Moreover, the original ResNeXt was created to classify images into several categories with high computation costs. In this study, the method performs both multiclass and binary classification tasks. Multiclass classification is used to assess burn severity based on burn depth. After that, based on depth, burns may be broken down into two distinct types: graft and non-graft. Reducing the first-layer filter size from 7×7 to 5×5 is the first change to the original ResNeXt model's design, because the larger filter size resulted in lower pixel intensity in the burnt region, which led to a rise in the frequency of spurious negative results for both grafts and non-grafts. Furthermore, the convolution sizes of Conv1, Conv2, Conv3, Conv4, and Conv5 are also changed to reduce the computation cost while maintaining cardinality. We also applied Leaky ReLU instead of the ReLU activation for faster model convergence. Table 2 also shows that conv2, conv3, and conv4 are shrinking in size. After implementing all modifications, the number of neurons decreased from 23×10^6 to 5×10^6, as shown in Table 3. The detailed architecture of the proposed model is shown in Fig. 1.
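
As a rough illustration of the kind of stem change described above (a 5×5 first convolution followed by Leaky ReLU), here is a minimal PyTorch sketch; the layer names, channel counts and strides are assumptions for illustration, not the authors' exact BuRnGANeXt50 code.

```python
import torch
import torch.nn as nn

class ModifiedStem(nn.Module):
    """Illustrative first stage: a 5x5 stem convolution (instead of ResNeXt's 7x7)
    followed by Leaky ReLU, in the spirit of the BuRnGANeXt50 modifications."""
    def __init__(self, in_channels=3, out_channels=64):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels,
                              kernel_size=5, stride=2, padding=2, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.LeakyReLU(0.01, inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

# Example: a 100x100 RGB burn image, the input size used for training in the paper.
x = torch.randn(1, 3, 100, 100)
print(ModifiedStem()(x).shape)   # torch.Size([1, 64, 50, 50])
```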

Topology of BuRnGANeXt50 for human burn diagnosis.

This model has several essential building blocks, including convolution, residual, ReLU activation, softmax, and flattened layers. The results of grouped convolutions of neurons inside the same kernel map are summed together by pooling layers, which reduce the input dimensionality and enhance model performance. The pooling units in the proposed model constitute a grid, with each pixel representing a single voting location, and the value is selected to gain overlap while reducing overfitting. Figure 2 describes the structure of the model's convolution layer. Pooling units form a grid, with each unit centred on a \(z \times z\) neighbourhood. Where a standard CNN would use \(S = z\), the proposed model sets \(S < z\) to increase overlap and decrease overfitting [34]. The proposed architecture was developed to handle the unique issues of burn diagnosis, emphasizing decreasing overfitting and enhancing model accuracy.

The pooling layers are convolutions in a grouped manner.

The inner dot product is the essential operation that neurons perform and is the foundation of an artificial neural network's convolutional and fully connected layers. The inner dot product may compute the aggregate transform, as illustrated in Eq. (1).

$$\sum_{i = 1}^{K} w_{i} \rho_{i}$$

(1)

Here \(\rho\) represents the neuron's K-channel input vector, and \(w_{i}\) is the filter weight for the i-th neuron. This model replaces the elementary transformation \(w_{i} \rho_{i}\) with a more generic function. By expanding along a new dimension, this generic function reduces depth. The model calculates the aggregated transformation as follows:

$$\Im\left( \rho \right) = \sum_{i = 1}^{\mathbb{C}} \Upsilon_{i}\left( \rho \right)$$

(2)

The function \(\Upsilon_{i}(\rho)\) is arbitrarily defined; \(\Upsilon_{i}\) projects \(\rho\) into a low-dimensional embedding and then transforms it, similar to a simple neuron. \(\mathbb{C}\) represents the number of transformations to be summed in Eq. (2) and is known as cardinality [35]. The aggregated transformation of Eq. (2) serves as the residual function [36] (Fig. 3):

$$x = \rho + \sum_{i = 1}^{\mathbb{C}} \Upsilon_{i}\left( \rho \right)$$

(3)

where \(x\) is the model's predicted result.

Channel and spatial attention modules are depicted in (A) and (B), respectively, in these schematic illustrations.

Finally, a flattened layer and a global average pooling layer are added at the top of the model. The softmax activation classifies burns into binary and multiclass categories. The softmax uses the exponent of each output-layer value to convert logits to probabilities [37]. The vector \(\Phi\) is the system input, representing the feature set. Our study uses k-class classification, with three levels of burn severity (\(k = 3\)) and two levels of graft versus non-graft (\(k = 2\)). For predicting classification results, the bias term \(W_{0} X_{0}\) is added in each iteration.

$$p(\rho = i \mid \Phi^{(j)}) = \frac{e^{\Phi_{i}^{(j)}}}{\sum_{i = 0}^{k} e^{\Phi_{i}^{(j)}}}$$

(4)

$$\text{in which}\;\Phi = W_{0} X_{0} + W_{1} X_{1} + \ldots + W_{k} X_{k}$$

(5)
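
As a worked example of Eqs. (4) and (5), the softmax computation for the three-class burn-severity case can be sketched as follows (the logit values are invented for illustration):

```python
import numpy as np

# Illustrative logits Phi = W0*X0 + W1*X1 + ... for k = 3 burn-severity classes.
phi = np.array([2.1, 0.3, -1.2])

# Eq. (4): exponentiate each logit and normalise by the sum of exponentials.
probabilities = np.exp(phi) / np.exp(phi).sum()
print(probabilities, probabilities.sum())  # the probabilities sum to 1
```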

The residual attention block, which allows attention to be routed across groups of separate feature maps, is shown in Fig. 3. Furthermore, the channel's extra feature map groups combine the spatial information of all groups via the spatial attention module, boosting the CNN's capacity to represent features. The block comprises feature map groups, feature transformation channels, spatial attention algorithms, and related components. Convolution operations can be performed on feature groups, and cardinality specifies the number of feature map groups. A new parameter, S, indicates the total number of groups in the channel set [38] and the number of subgroups in each of the N input feature groups. A channel scheduler is a tool that optimizes the processing of incoming data through channels. This method transforms feature subsets. The total number of feature groups is G = N × S.

Using Eq. (6), we conduct an essential feature modification on subgroups inside each group after channel shuffling.

$$g\left( r, i, j \right) = \begin{bmatrix} \cos \frac{r\pi}{2} & -\sin \frac{r\pi}{2} \\ \sin \frac{r\pi}{2} & \cos \frac{r\pi}{2} \end{bmatrix} \begin{bmatrix} i \\ j \end{bmatrix}$$

(6)

Here \(0 \le r < 4\), and \((i, j)\) stands for the original matrix's coordinates. \(K\) represents the 3×3 convolution of the bottleneck block, and the output is written as \(y_{s}\). Then, for each input \(x_{s}\)

we have:

$$y_{s} = \begin{cases} K\left( g_{r}\left( x_{s} \right) \right), & s = 0 \\ K\left( g_{r}\left( x_{s} \right) \right) \odot y_{0}, & 0 < r = s < 4 \end{cases}$$

(7)

Here \(g_{r}\) denotes the transformation applied to the input \(x_{s}\), and \(\odot\) corresponds to element-wise multiplication in the matrix's related feature transformation. The features of \(x\) being transformed are shared across the three 3×3 convolution operators \(K\).

Semantic-specific feature representations can be improved by exploiting the interdependencies among channel graphs. We use the feature map's channels as individual detectors. Figure 3A depicts how we send the feature map of the \(n \in \{1, 2, \ldots, N\}\) group, \(G^{n} \in R^{C/N \times H \times W}\), to the channel attention module. As a first step, we use global average pooling (GAP) to gather global context information linked to channel statistics [39]. The 1D channel attention maps \(C^{n} \in R^{C/N}\) are then inferred using the shared fully connected layers.

$$C^{n} = D_{\text{sigmoid}}\left( D_{\text{ReLU}}\left( \text{GAP}\left( G_{n} \right) \right) \right)$$

(8)

("{D}_{sigmoid}and{D}_{mathit{Re}LU}") represents a fully linked layer that uses both "Sigmoid" and "ReLU" as activation functions. At last, Hadamard products are used to infer a groups attention map and the corresponding input features. Then the components from each group are weighted and added together to produce an output feature vector. The final channel attention map

$$C \in R^{C/N \times H \times W}, \quad C = \sum_{n = 1}^{N} \left( C^{n} \odot G^{n} \right)$$

(9)

Each group's 1×1 convolution kernel weight is multiplied by the 3×3 kernel weight from the subgroup's convolutional layer. The global feature dependency is preserved by adding the groups' channel attention weights, which all sum to the same value.

A spatial attention module is used to synthesize spatial links and increase the spatial size of associated features; it is separate from the channel attention module. The spatial information of the feature maps is first aggregated using global average pooling (GAP) and global maximum pooling (GMP) [39] to obtain two distinct contextual descriptors. Next, \(\text{GAP}(C) \in R^{1 \times H \times W}\) and \(\text{GMP}(C) \in R^{1 \times H \times W}\) are joined to obtain \(S_{c} \in R^{2 \times H \times W}\).

$$S_{c} = \text{GAP}\left( C \right) + \text{GMP}\left( C \right)$$

(10)

The plus sign (+) denotes concatenation of the feature maps. A regular convolutional layer then retrieves the spatial weight information \(S_{conv}\). The final spatial attention map \(S \in R^{C/N \times H \times W}\) is obtained by element-wise multiplication of this convolution output with the input feature map \(C\).

$$S = \text{Conv}_{3 \times 3}\left( S_{C} \right) \odot C$$

(11)

("Con{v}_{3times 3}") means regular convolution, while "Sigmoid" denotes the activation function.

Leaky ReLU activation-based deep learning models do not rely on input normalization to prevent saturation, and neurons in this model are more efficient at learning from negative inputs. Nevertheless, the neural activity \(\alpha_{u,v}^{i}\) is calculated at a point \((u, v)\) using kernel \(i\), which facilitates generalization, and the ReLU nonlinearity is then applied. The response-normalized activity \(b_{u,v}^{i}\) is determined from \(\alpha_{u,v}^{i}\) using Eq. (12).

$$b_{u,v}^{i} = \frac{\alpha_{u,v}^{i}}{\left( t + \alpha \sum_{j = \max(0,\, i - n/2)}^{\min(N - 1,\, i + n/2)} \left( \alpha_{u,v}^{j} \right)^{2} \right)^{\beta}}$$

(12)

where \(N\) is the total number of kernels and \(t, \alpha, n, \beta\) are constants. The sum is computed over the \(n\) neighboring kernel maps [40]. We trained the network using \(100 \times 100 \times 3\) pictures and the original ResNeXt CNN topology's cardinality hyper-parameter \(\mathbb{C} = 32\). The algorithm of the proposed method is shown below.

Algorithm of the proposed method.

All authors contributed to the conception and design of the study. All authors read and approved the final manuscript.

Excerpt from:

Spatial attention-based residual network for human burn ... - Nature.com

The TALOS-AI4SSH project: Expanding research and innovation … – Innovation News Network

The TALOS-AI4SSH project is setting up a Centre of Excellence in Digital Humanities at the University of Crete, Greece.

The TALOS-Artificial Intelligence for Humanities and Social Sciences (TALOS-AI4SSH) project gets its name from ancient Greek mythology. Talos was an ancient bronze giant whose mission was to protect Crete from pirates. Nowadays, Talos is usually considered the symbol of Artificial Intelligence (AI).

The TALOS-AI4SSH project received €2.5m in funding from the European Commission (HORIZON-WIDERA-2022-TALENTS-01-01-ERA Chairs) in order to set up a Centre of Excellence in Digital Humanities at the University of Crete, Greece.

The new centre will be integrated at the University of Crete Research Centre (UCRC) for the Humanities, the Social and Education Sciences.

Established in 1978, the University of Crete (UoC), ranked among the top Greek universities, is a young public educational institution sited in a region rich in ancient and modern Mediterranean cultures.

Currently, around 20,000 undergraduate and graduate students study at the UoC, at the Schools of Philosophy, Education, Social Sciences, Sciences & Technology, and Medicine. They are taught by outward-looking academic staff committed to quality in teaching, research, and community partnerships.

The UoC is the only Greek University, to date, that has been awarded the HR Excellence in Research logo.

The objectives set by the TALOS-AI4SSH project align with the EU Social Sciences and Humanities (SSH) integration and digital agenda.

The project will expand the research and innovation potential, generate new knowledge, strengthen innovation and knowledge transfer activities, create industry-driven projects, and build capacity through combined academic-industry training.

In close collaboration with local and global industry partners, the TALOS-AI4SSH project will support SSH-informed evidence-based AI policymaking, develop key combined digital SSH competencies, and produce interdisciplinary solutions to both societal and technological AI issues.

The vision of the TALOS-AI4SSH project is to train a new generation of hybrid scholars who will combine knowledge and skills from both the humanities and computer science sectors.

A vision is needed for new educational curricula, combining knowledge from the humanities, the social sciences, and engineering studies. Education on computer science /informatics and its societal impact must start as early as possible. Students should learn to combine information-technology skills with awareness of the ethical and societal issues at stake. (Vienna Manifesto on Digital Humanities)

The traditional separation between humanities, arts, social sciences and STEM (Science, Technology, Engineering and Mathematics) is not suitable for the needs of the digital age. Artificial Intelligence is not a STEM discipline. It is in essence trans-disciplinary and requires a spectrum of capabilities that is not covered by current education curricula. It is urgent to redesign studies. This also gives a unique opportunity to truly achieve inclusion and diversity across academic fields. (V. Dignum, Responsible AI)

The TALOS mission focuses on the need for AI that is reflective about the nature of intelligence and humanity. It also addresses the requirement for SSH that is transformative and embraces the methodological advances of AI to generate innovation and impact.

TALOS will combine the strengths of neural networks with symbolic reasoning to create a scalable, sustainable, and explainable representation of knowledge in the humanities and the social sciences.

The TALOS-AI4SSH project was selected for funding in the HORIZON-WIDERA call ERA Chairs. This programme aims to help universities to promote structural changes and to achieve excellence on a sustainable basis.

This is done under the guidance of a prominent researcher. The ERA Chair Professor then sets up a team and undertakes tasks, such as supervision, research, teaching etc., that contribute to the educational and research upgrading of the host institution.

Christophe Roche is the ERA chair holder at the University of Crete. He is a Professor in Artificial Intelligence at the University of Savoie Mont-Blanc (France), a special appointment Professor at the University of Liaocheng (China, National Talent), and Head of the Condillac research group (France), and KETRC Lab (China).

He started his career in the private research sector in Artificial Intelligence (Paris) and taught in Switzerland (Neuchatel) and in Paris for over ten years (INALCO, Paris Dauphine). He was a researcher and lecturer at the University Nova of Lisbon (2009-2022).

His domains of interest are knowledge engineering (ontology), linguistics (terminology), digital humanities and the semantic web.

In 2007, he created the TOTh international conference on terminology and ontology, which he has been organising every year since, also chairing its scientific committee. He also chairs the terminology committee of AFNOR (Paris) in relation with ISO.

Roche has set up and participated in 14 international projects, including ten EU-funded projects. He has set up PhD courses and summer schools in France, Portugal, and China in Artificial Intelligence, linguistics, and digital humanities.

Since March 2023, Professor Roche has joined the Department of Philology of the UoC on a five-year contract.

The TALOS-AI4SSH project is setting up a Centre of Excellence in Digital Humanities. The activities of the new centre are being directed by Professor Roche and include:

The TALOS-AI4SSH project is expected to have a significant impact in the following areas:

The project is designed to:

The key research topics are:

1. Semantic annotation of texts; 2. Classifying and naming objects; 3. Pattern recognition in literary/journalistic corpora; 4. Digitisation and datafication of education; 5. Hybrid AI-NLP Software for SSH; and 6. Standards for SSH.

The TALOS-AI4SSH Project is funded by the Horizon Europe Programme under grant agreement No. 101087269.

Please note, this article will also appear in the fifteenth edition of our quarterly publication.


Read more here:

The TALOS-AI4SSH project: Expanding research and innovation ... - Innovation News Network

Industry 4.0: The Transformation of Production – Fagen wasanni

Modern society is undergoing a significant transformation in the production of goods, known as Industry 4.0 (I4.0). This transformation is driven by the digitization of society and manufacturing, and the integration of new, modern, smart, and disruptive technologies.

Industry 4.0 refers to the intelligent networking of machines and processes using new information and communication technologies such as the Internet of Things (IoT), cloud computing, fog computing, and Artificial Intelligence (AI). These technologies have increased the speed and breadth of knowledge in the modern economy and society.

From the first industrial revolution, characterized by mechanization through water and steam power, to the mass production and assembly lines of the second industrial revolution using electricity, and the adoption of computers and automation in the third industrial revolution, Industry 4.0 builds upon these advancements. It enhances them with smart and autonomous systems that work in IoT environments, utilizing digital data analyzed by machine learning technology.

The combination of cyber-physical systems (CPS), IoT, AI, and machine learning is making Industry 4.0 possible and turning the concept of the smart factory into a reality. Smart machines, with access to more data, enable factories to become less wasteful and more efficient and productive. The true power of Industry 4.0 lies in the digital connection between network CPS, creating and sharing information.

CPS are systems that integrate constituents from the cyber and physical domains. They monitor and control physical processes through a network of actuators and sensors and can be implemented on various scales. CPS form the basis of smart machines in Industry 4.0 and use modern control systems and IoT environments.

Smart cyber-physical systems (SCPS) are the next generation of computing applications, integrating communication, computation, process control, and AI technologies in a transparent and novel way. SCPS exhibit highly cooperative behavior, self-awareness, self-adaptation, and self-optimization. They are complex systems that enable various AI techniques and technologies within IoT constraints.

SCPS have evolved beyond traditional CPS definitions and are capable of building awareness, reasoning, and adapting to their environment. They integrate hardware, software, AI, and cyberware technologies to a higher degree than any other system before. AI provides cognition to SCPS, allowing them to model, represent, and learn complex interactions and behaviors.

The implementation of artificial neural networks, specifically the multilayer perceptron artificial neural network (MLPANN), in SCPS enables intelligent and autonomous decision-making. MLPANN was chosen for its simple and easy-to-implement architecture.
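
As an illustration of how a multilayer perceptron of this kind might be wired into a decision task, here is a minimal scikit-learn sketch; the sensor features, labels and network size are invented for illustration and do not come from the article.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)

# Hypothetical CPS training data: 4 sensor readings per sample,
# label 1 = "intervene", 0 = "continue normal operation".
X = rng.random((200, 4))
y = (X[:, 0] + X[:, 3] > 1.0).astype(int)

# A small multilayer perceptron, the same family of model (MLPANN) named above.
model = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=1000, random_state=0)
model.fit(X, y)

new_reading = [[0.9, 0.2, 0.4, 0.7]]
print(model.predict(new_reading))  # e.g. [1] -> intervene
```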

In conclusion, Industry 4.0 is revolutionizing the production of goods through the integration of new technologies. CPS and SCPS are at the forefront of this transformation, enabling the development of smart machines and systems. The implementation of AI and neural networks in SCPS enhances their capabilities and paves the way for intelligent and autonomous decision-making.

Read more:

Industry 4.0: The Transformation of Production - Fagen wasanni

Fast Simon Launches Vector Search With Advanced AI for … – GlobeNewswire

LOS ALTOS, Calif., Aug. 04, 2023 (GLOBE NEWSWIRE) -- Fast Simon, the leader in AI-powered shopping optimization, today announced Vector Search with advanced AI for eCommerce. Vector Search is able to handle longer search queries and reduce the return of no results compared to keyword search alone. This makes it easier for eCommerce sites to match buyer intent, personalize the shopping experience, answer questions and make product recommendations.

Rather than matching keywords like most eCommerce search engines, Vector Search uses natural language processing and neural networks to analyze a query. Vector embedding maps the words from the search to a corresponding vector to detect synonyms, intent and ranking, and it clusters concepts to deliver more complete results. For example, the search "fall wedding guest dresses for black tie event" would return relevant results for long dresses, dark colors and options for sleeves, even if the items weren't all tagged with the exact keywords.
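
The mechanics of embedding-based retrieval can be sketched as follows; this is a generic illustration using an open-source sentence-embedding model and cosine similarity, not Fast Simon's implementation, and the product catalogue is made up.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # any sentence-embedding model works

# Placeholder catalogue; real product data and models would come from the retailer.
products = [
    "long navy chiffon evening gown with sleeves",
    "short floral summer sundress",
    "black velvet formal maxi dress",
    "men's waterproof hiking boots",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
product_vectors = model.encode(products, normalize_embeddings=True)

def search(query, top_k=2):
    """Embed the query and rank products by cosine similarity (dot product of unit vectors)."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = product_vectors @ q
    best = np.argsort(scores)[::-1][:top_k]
    return [(products[i], float(scores[i])) for i in best]

print(search("fall wedding guest dresses for black tie event"))
```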

"While many eCommerce search queries today are just one to two words, Gen Z tends to search differently. They often use full sentences and look for contextual results that match their intent. This shift requires a new approach to search that goes beyond keywords to understand the meaning," said Zohar Gilad, CEO of Fast Simon. "As the next generation of shoppers, meeting the expectations of Gen Z is crucial for retailers who want to stay relevant."

Vector Search is Fast Simon's latest innovation, helping eCommerce brands enhance the online shopping experience. Key benefits include:

For more information about Vector Search, visit the Fast Simon blog.

About Fast Simon: Fast Simon is the leader in AI-powered shopping optimization. Its revolutionary platform uniquely integrates shopper, behavioral and store signals for strategic merchandising and optimized shopping experiences that dramatically increase conversions and average order value (AOV). Fast Simon powers shopping optimization at thousands of fast-growing merchants and sophisticated brands, including Steve Madden, Natural Life and Motherhood. Fast Simon integrates seamlessly with all major eCommerce platforms, including Shopify, BigCommerce, Magento, Microsoft Dynamics and WooCommerce.

Media Contact: Liesse Jayalath, Look Left Marketing, fastsimon@lookleftmarketing.com

View post:

Fast Simon Launches Vector Search With Advanced AI for ... - GlobeNewswire

Research on key acoustic characteristics of soundscapes of the … – Nature.com

The term "soundscape" was first coined by Murray Schafer in his book "The tuning of the world" published in 19971,2. In 2014, the International Organization for Standardization systematically elaborated the definition of the soundscape: "acoustic environment as perceived or experienced and/or understood by a person or people, in context"3. It also defined the constituent elements of the soundscape as sound elements, environmental elements and audio receivers. The physical characteristics of sound include loudness, pitch, and timbre. In the discipline of soundscape ecology, sound is classified into three distinct types: biophonies, geophonies and anthrophonies4,5.

Classical Chinese gardens reflect the profound metaphysical beauty of Chinese culture in scrupulous garden design, and they are a significant component of world cultural heritage. The creation of classical Chinese gardens places a strong emphasis on crafting a multisensorial experience through the sensescape, in which soundscape plays a critical role6. Early studies have revealed a remarkable consistency in the adoption of similar soundscapes and sound sources in classical Chinese gardens7. However, most of these studies provided only a summary and categorization, without further analysing the physical characteristics of these sound sources and the reasoning behind the preference matrix.

Most soundscape research focuses on loudness, despite the fact that the physical characteristics of sound also include frequency, timbre, and duration. Scholars have pointed out that once the loudness of a sound remains within people's comfort zone, evaluation of the soundscape depends mainly on the type of sound source and personal subjective preference, while the influence of sound frequency, timbre and other physical properties on preference is ignored [8,9]. Despite the profound impact of frequency attributes on human sound perception and the growing recognition of their therapeutic effects on physical and mental well-being [10], relevant research in this domain remains relatively sparse. Many scholars use soundscape data, combined with theories and methods from sociology, psychology or physiology, to evaluate soundscapes. On the one hand, researchers such as Hunte and Jo concluded that the sound of water, wind, birds and other natural sounds has healing effects on human beings, but did not explore the healing mechanism of these soundscapes in depth [11,12]; on the other hand, Casc [13] standardized the use of the frequency characteristics of sound to measure species diversity and discussed the role of sound frequency in the study of species diversity. This study takes the 12 most frequently recorded soundscapes in classical Chinese gardens as its research object, analyzes their frequency characteristics, discusses their healing mechanisms, and provides a frequency perspective for the study of soundscape healing mechanisms. Audio data for these identified soundscapes is sourced from the BBC Sound Effects website, allowing for an acoustic analysis that focuses solely on the frequency dimension while mitigating the influence of the physical variable of loudness. By analyzing sound spectrograms, Bai et al. [14] divided sound into two types based on sound duration: discrete and continuous. Discrete sounds tend to be associated with musical melodies; a melody is composed of two or more tones, which are discrete vocal events with pitch [15]. Continuous sound, on the other hand, forms a spectrogram through methods such as the Fourier transform [16]. The spectrogram is the basis for distinguishing between colored noise and white noise [17]. Not all noise is harmful to human health. Pink noise and white noise can provide a soothing quality by masking out disturbing sounds from the external environment to inhibit the activation of brain activity, which is evident from the reduced complexity of electroencephalogram (EEG) recordings [18].
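
The white/pink noise distinction referred to here can be made concrete by estimating the slope of a signal's power spectrum on a log-log scale (roughly 0 for white noise, about -1 for pink noise). A minimal sketch with NumPy and SciPy, using a synthetic signal as a stand-in for a recorded soundscape:

```python
import numpy as np
from scipy.signal import welch

fs = 44_100                              # sample rate in Hz
rng = np.random.default_rng(0)
signal = rng.standard_normal(fs * 10)    # 10 s of synthetic white noise as a stand-in

# Estimate the power spectral density, then fit a line in log-log space.
freqs, psd = welch(signal, fs=fs, nperseg=4096)
mask = (freqs > 20) & (freqs < 20_000)   # restrict to the audible band
slope, _ = np.polyfit(np.log10(freqs[mask]), np.log10(psd[mask]), 1)

# A slope near 0 suggests white noise; a slope near -1 (a 1/f spectrum) suggests pink noise.
print(f"estimated spectral slope: {slope:.2f}")
```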

The ancient Chinese culture attached great importance to sound perception. In ancient Chinese idioms, when the words "ear" and "eye" appear together, the word "ear" is always placed before the word "eye"; the sound comes first, and the form follows. Soundscapes have also been recorded in various ancient texts. Wu et al. [6] conducted an extensive literature review of the soundscapes present in the "Book of Songs" (also known as Shijing) and found that 86 of its 305 poems involve soundscapes.

In classical Chinese gardens, the creation and crafting of soundscapes is a meticulous and significant art form with a longstanding history. Tang et al. [19] conducted an extensive literature review of the soundscapes present in "Yuanye", the first comprehensive garden art monograph in China. They found 21 descriptions of soundscapes, with biophonies (e.g. bird calls) the most commonly recorded, followed by geophonies (e.g. water and wind sounds), and lastly anthrophonies (e.g. singing and musical performance). Xie and Ge [7] also conducted a literature review of important contemporary works on classical Chinese gardens and found a similar ranking of the soundscapes based on their frequency of occurrence [20,21,22,23,24,25].

Although the ancients did not systematically acquire relevant acoustic theory and technology, their understanding and experience are demonstrated in the mastery of soundscapes in the making of the garden landscape [20,23]. There are many well-known soundscapes in existing classical Chinese gardens. In the Humble Administrator's Garden, the "Pine Wind Pavilion" (Songfengting) has a horizontal banner inscribed with the phrase "Listen to the Pine Wind", an extract from the poem "I love the pine wind, and the courtyard is planted with pine trees. Every time I hear the sound, I am delighted." (Fig. 1a). The "Rain Pavilion" (Tingyuxuan) has a pond filled with lotus flowers in front, with Musa trees and bamboo planted by the side. When raindrops fall on these different plants, they produce responses in listeners; often, these responses will be highly individual, differing between people (Fig. 1b and c). Chengde Mountain Resort has a building dedicated to listening to the wind blowing through an old-growth pine forest (Fig. 1d). The "Orioles Singing in the Willow" (Liulangwenying) is a lakefront park located on the southeast bank of the West Lake, famous for the calls of orioles (Fig. 1e). The Octave Stream (Bayinjian) in Jichang Garden demonstrates skilful stonework, creating an enclosed environment that isolates noise from the outside world and amplifies the trickling sound of spring water (Fig. 1f).

Soundscape research centers on two major aspects: the objective assessment of the physical characteristics of sound, and the subjective evaluation of the perception of sound. However, most research on soundscape perception has focused mainly on perceived loudness, despite the fact that frequency and timbre are equally important components of the physical characteristics of sound [26]. In this paper, we conducted an extensive systematic literature review of related soundscape research and selected the 12 sound sources that are most typically present in classical Chinese gardens. We then collected the respective audio samples from the BBC's library of Sound Effects and classified these 12 audio samples into discrete and continuous sounds based on spectrogram analyses. Their frequency distributions were further analyzed by a pitch estimation algorithm and an LSTM neural network noise-type judgment method, based on the theories of musical tones, melody and colored noise classification. According to the results, the frequency distribution characteristics of discrete sounds indicate a pitch change, while continuous sounds show white noise or pink noise. The preference mechanism for these sound sources from the perspective of healing and health benefits is discussed in this paper. This study presents a novel approach by explaining the physical attributes and preference mechanisms of soundscapes in classical gardens from the perspective of sound frequency. The innovation of this study can be observed in two aspects.

Site photos of existing classical Chinese gardens with distinct soundscapes. Inset images show representatives of the six soundscapes in existing classical Chinese gardens. From left to right: (a) Tingyuxuan; (b) Songfengting; (c) Liutingge; (d) Wanhesongfengdian; (e) Liulangwenying; (f) Bayinjian.

Firstly, methodological innovation: the utilization of LSTM neural networks and principles from music theory to analyze audio data led to the discovery of frequency-based features that are favored by individuals. Specifically, it was found that the frequency distribution of continuous sound sources conforms to the distribution patterns of white noise and pink noise, while discrete sounds exhibit two or more variations in pitch. This approach provides a unique methodology for understanding and interpreting the frequency characteristics of preferred sounds. Secondly, theoretical innovation: the study offers an explanation from a therapeutic perspective as to why people are drawn to sounds with these frequency features. By highlighting the potential healing effects of these sounds, the research contributes to the improvement of urban green spaces and the enhancement of residents' physical and mental well-being. The study provides a frequency-based perspective and serves as a valuable reference for the enhancement of acoustic environments in urban green spaces and the promotion of residents' overall health.

Visit link:

Research on key acoustic characteristics of soundscapes of the ... - Nature.com

Signal and noise: how timing measurements and AI are improving … – ATLAS Experiment at CERN

The Large Hadron Collider (LHC) doesn't collide one particle at a time: it hurls together more than one hundred billion proton pairs every twenty-five nanoseconds! Most pass by each other and continue on their way, but many collisions can happen at the same time. Physicists then have to disentangle the collisions with rare or interesting signatures from the noise of overlapping pile-up collisions. This major experimental challenge has become even more important in recent data-taking runs of the LHC, as higher collision rates result in more pile-up collisions.

The ATLAS Collaboration recently presented two new results explaining how detector timing measurements and calorimeter signal calibration using artificial intelligence (AI) are being used to further improve the quality of data recorded by the experiment.

The ATLAS Liquid Argon and Tile Calorimeters are particularly susceptible to pile-up, as signals from these sub-detectors can take longer to read out than there is time between LHC collisions. During this long read-out period, particles from other collisions can contribute to the noise of the recorded signal. When a particle hits a calorimeter, it sets off a shower of secondary particles that deposit their energy in the detectors. The energy of the initial particle can be measured by reconstructing this shower. To do this, the ATLAS calorimeter is split into finely-segmented 3D cells that allow more information to be collected about the showers development. The shower is reconstructed as a cluster of many cells, using an algorithm that first spots cells with strong signals, and then collects neighbouring cells to get a complete picture of the shower (see Figure 1). Particle showers appear as groups of pixels, and the colour changes from blue to red as the signal strength increases. While some of these signals might come from an interesting, rare particle, many others are likely to be pile-up.

The ATLAS Collaboration has improved its calorimeter cell clustering algorithm to better reject pile-up while retaining interesting signals. Besides particle energies, calorimeters also measure the time at which the energy was deposited in their cells (see Figure 2). This is centred around the LHC collision clock, and any signal measured more than 12.5 billionths of a second away from the expected collision time is likely from a different bunch of protons. Excluding these out-of-time cells from the cluster is a powerful way to suppress pile-up, reducing noisy contributions by up to 80%. As a bonus, removing these unwanted clusters means ATLAS will need 6% less storage for the Run 3 data. This may seem like a small reduction, but every little bit adds up when dealing with LHC-scale datasets!
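The timing cut itself is conceptually simple. A hedged sketch of how out-of-time cells could be dropped before clustering is shown below; the 12.5-nanosecond window comes from the article, but the data structure and field names are invented for illustration.

```python
# Minimal sketch: drop out-of-time calorimeter cells before clustering.
# 12.5 ns is half the 25 ns spacing between LHC bunch crossings; the Cell
# structure and its field names are invented for illustration.
from dataclasses import dataclass

@dataclass
class Cell:
    energy: float     # measured energy deposit
    time_ns: float    # measured time relative to the expected collision (ns)

def in_time(cells, window_ns=12.5):
    """Keep only cells consistent with the triggered bunch crossing."""
    return [c for c in cells if abs(c.time_ns) <= window_ns]

cells = [Cell(1.2, 0.8), Cell(0.4, 31.0), Cell(2.5, -3.1), Cell(0.9, -27.5)]
print(in_time(cells))   # the two out-of-time cells (around +/-30 ns) are rejected
```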

Once the interesting signals are separated from pile-up, the next step is calibration. Different particles make different kinds of showers in the calorimeters: electromagnetic showers produced by photons and electrons are narrow and dense, while hadronic showers from strongly-interacting particles like pions are larger and more diffuse. A shower's signal depends on the type of interaction that produced it: hadronic showers leave less of a signal than electromagnetic ones of the same energy. Calibrating the energy of clusters to account for this is an important step in correctly reconstructing the energy flow of an event. Luckily, many features of a cluster, such as its density and depth in the detector, give information about the type of shower being measured. For reliable cluster energy calibration, many of these features must be considered at once, making it a natural place to apply modern AI algorithms.

ATLAS physicists recently calibrated the energy scale of calorimeter cell clusters with Deep Neural Networks (DNN in Figure 3) and Bayesian Neural Networks (BNN in Figure 3), and found that AI algorithms can significantly improve the accuracy and precision of the calibration compared with earlier methods (the LCW hadronic scale in Figure 3), which used a tabulated calibration that considered only a limited number of features. Using AI allows much more information per cluster to be used, resulting in a calibration that is also more resilient to the effects of pile-up.
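Conceptually, this calibration is a regression from cluster features to an energy correction. A minimal sketch of such a regression with a small feed-forward network is shown below; the feature list, network size, and synthetic training data are placeholders and do not reflect the actual ATLAS models.

```python
# Minimal sketch: regress an energy-response correction from cluster features
# with a small feed-forward network. Features, network size and the synthetic
# training data are placeholders, not the actual ATLAS calibration.
import torch
import torch.nn as nn

# Illustrative per-cluster features: [log(E_cluster), depth, density, width]
n_features = 4
model = nn.Sequential(
    nn.Linear(n_features, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),            # predicted log(E_true / E_cluster)
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Synthetic stand-in for simulated clusters with known true energy.
features = torch.randn(10_000, n_features)
target = (0.3 * features[:, :1]       # pretend the response depends on energy
          - 0.2 * features[:, 2:3])   # ... and on shower density

for epoch in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(features), target)
    loss.backward()
    optimizer.step()

# At inference time the predicted correction rescales the cluster energy:
# E_calibrated = E_cluster * exp(model(features_of_cluster))
```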

With high-fidelity pictures of collision events in hand, physicists will be able to refine their searches for new particles and precision measurements. However, this task will be made much more challenging in the high-pileup environment of the High-Luminosity LHC. To meet that challenge, ATLAS physicists will be testing new and creative approaches to the event reconstruction throughout Run 3 of the LHC.

Follow this link:

Signal and noise: how timing measurements and AI are improving ... - ATLAS Experiment at CERN

Elon Musk Hints at Finalizing Tesla FSD V12 Code, Needs More … – autoevolution

With the FSD Beta V11.4.6 making the rounds as one of the best builds of Tesla's self-driving software, people are already euphoric about the next iteration. Elon Musk confirmed that V12 is already in the alpha stage and that Tesla is working on the final piece of the FSD AI puzzle: vehicle control. Musk seems fascinated by Tesla's full self-driving software, which he has repeatedly predicted would be complete by year's end. He has already said this twice in 2023, which should add more weight to the prediction. Still, people have learned that Musk's predictions are to be taken with a grain of salt. What is true is that Musk appears to be devoting less time to Twitter and more to Tesla. He confirmed on August 2 that he "was buried in Tesla work all day," and the FSD software was undoubtedly high on his to-do list.

In a new tweet, or whatever it might be called these days, Musk revealed that Tesla is working on "the final piece of the FSD AI puzzle," which apparently is vehicle control. Instead of direct coded instructions, Tesla will rely more on neural networks for vehicle control. More than that, Musk confirmed that Tesla is already training these neural networks. However, the progress is slow because the EV maker is currently "training compute-constrained."

This is interesting, as most talks until now revolved around data gathering and how many miles the Tesla fleet was covering. Musk implied that Tesla now has more data than it can chew and obviously needs more computing power to accelerate development. At the moment, the whole AI industry is constrained because of a shortage of Nvidia GPUs. Tesla also admitted during the second-quarter earnings call that it can't get enough cards, and that's the main reason why Tesla deployed the Dojo supercomputer using processors designed in-house.

The Dojo supercomputer thrilled many Tesla fans, as if it were a magic solution to the FSD woes. During the second-quarter earnings call, Elon Musk poured cold water on these hopes, admitting that he would have preferred to have more Nvidia GPUs instead. Dojo is still useful as a complement to Tesla's Nvidia supercomputer, as it is optimized for processing large amounts of video images.

Previously, Musk said that V12 of the Tesla FSD software would be end-to-end AI, "from images in, to steering, brakes & acceleration out." What he meant was that neural networks would be used throughout, from processing the images captured by the car's cameras to controlling the car's movements. In his latest tweet, Musk revealed that switching to neural networks would allow Tesla to drop over 300,000 lines of C++ control code. This is a massive simplification, which should improve speed and accuracy by an order of magnitude.
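Tesla has not published the V12 architecture, so the following is only a generic sketch of what "images in, controls out" could look like: a convolutional backbone that maps stacked camera frames to steering, brake, and accelerator commands. Every layer size, output range, and input shape here is an assumption made for illustration, not Tesla's design.

```python
# Generic sketch of an end-to-end driving network: camera frames in,
# steering / brake / accelerator commands out. This is NOT Tesla's V12
# architecture (which is unpublished); all sizes and outputs are illustrative.
import torch
import torch.nn as nn

class EndToEndDriver(nn.Module):
    def __init__(self, n_cameras=8):
        super().__init__()
        # Shared convolutional backbone applied to a stack of camera frames.
        self.backbone = nn.Sequential(
            nn.Conv2d(3 * n_cameras, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Control head: steering angle plus brake and accelerator commands.
        self.head = nn.Sequential(
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 3),
        )

    def forward(self, frames):
        # frames: (batch, 3 * n_cameras, height, width)
        controls = self.head(self.backbone(frames))
        steering = torch.tanh(controls[:, 0])         # -1 (left) .. +1 (right)
        brake = torch.sigmoid(controls[:, 1])         # 0 .. 1
        accelerator = torch.sigmoid(controls[:, 2])   # 0 .. 1
        return steering, brake, accelerator

model = EndToEndDriver()
frames = torch.randn(1, 3 * 8, 120, 160)              # one stacked camera input
print(model(frames))
```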

More than that, FSD V12 will be "the thing," dropping the "beta" from its name and becoming commercial software. This shows Musk's confidence that the next software iteration will prove safe enough to be installed by anyone who pays for the FSD capability. The Tesla CEO has already tested an alpha build of V12 and considered it "mind-blowing," something he also said about previous versions of the FSD Beta.

Continued here:

Elon Musk Hints at Finalizing Tesla FSD V12 Code, Needs More ... - autoevolution