## Antonino Tumeo

List of Publications by Year in descending order

Source: https://exaly.com/author-pdf/2132307/publications.pdf

Version: 2024-02-01

101 papers, 932 citations, h-index 11, g-index 20

106 all docs, 106 docs citations, 106 times ranked, 582 citing authors

752698

| #  | Article                                                                                                                                                                    | IF  | CITATIONS |
|----|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----|-----------|
| 1  | Svelto: High-Level Synthesis of Multi-Threaded Accelerators for Graph Analytics. IEEE Transactions on Computers, 2022, 71, 520-533.                                        | 3.4 | 11        |
| 2  | DRIPS: Dynamic Rebalancing of Pipelined Streaming Applications on CGRAs. , 2022, , .                                                                                       |     | 2         |
| 3  | ASAP. , 2022, , .                                                                                                                                                          |     | 3         |
| 4  | Energy characterization of graph workloads. Sustainable Computing: Informatics and Systems, 2021, 29, 100465.                                                              | 2.2 | 2         |
| 5  | HAM: Hotspot-Aware Manager for Improving Communications With 3D-Stacked Memory. IEEE Transactions on Computers, 2021, 70, 833-848.                                         | 3.4 | 2         |
| 6  | OpenCGRA: Democratizing Coarse-Grained Reconfigurable Arrays. , 2021, , .                                                                                                  |     | 12        |
| 7  | Towards Automatic and Agile AI/ML Accelerator Design with End-to-End Synthesis. , 2021, , .                                                                                |     | 4         |
| 8  | The future is big graphs. Communications of the ACM, 2021, 64, 62-71.                                                                                                      | 4.5 | 56        |
| 9  | ARENA: Asynchronous Reconfigurable Accelerator Ring to Enable Data-Centric Parallel Computing. IEEE Transactions on Parallel and Distributed Systems, 2021, 32, 2880-2892. | 5.6 | 11        |
| 10 | EXAGRAPH: Graph and combinatorial methods for enabling exascale applications. International Journal of High Performance Computing Applications, 2021, 35, 553-571.         | 3.7 | 9         |
| 11 | Invited: Bambu: an Open-Source Research Framework for the High-Level Synthesis of Complex Applications. , 2021, , .                                                        |     | 33        |
| 12 | DynPaC: Coarse-Grained, Dynamic, and Partially Reconfigurable Array for Streaming Applications. , 2021, , .                                                                |     | 3         |
| 13 | Automated Generation of Integrated Digital and Spiking Neuromorphic Machine Learning Accelerators. , 2021, , .                                                             |     | 8         |
| 14 | Invited: Software Defined Accelerators From Learning Tools Environment. , 2020, , .                                                                                        |     | 2         |
| 15 | SODA. , 2020, , .                                                                                                                                                          |     | 5         |
| 16 | Advert: An Asynchronous Runtime for Fine-Grained Network Systems. , 2019, , .                                                                                              |     | 0         |
| 17 | Software defined architectures for data analytics. , 2019, , .                                                                                                             |     | 2         |
| 18 | Guest Editorial: Special Issue on Computing Frontiers. International Journal of Parallel Programming, 2018, 46, 333-335.                                                   | 1.5 | 0         |
| 19 | Community Detection on the GPU. , 2017, , .                                                                                                                                                           |     | 16        |
| 20 | Exploring performance and energy tradeoffs for irregular applications: A case study on the Tilera many-core architecture. Journal of Parallel and Distributed Computing, 2017, 104, 234-251.          | 4.1 | 5         |
| 21 | Pushing the Limits of Irregular Access Patterns on Emerging Network Architecture: A Case Study. , 2017, , .                                                                                           |     | 0         |
| 22 | Introduction to GraML Workshop. , 2017, , .                                                                                                                                                           |     | 0         |
| 23 | Exploring DataVortex Systems for Irregular Applications. , 2017, , .                                                                                                                                  |     | 6         |
| 24 | Exploring Efficient Hardware Support for Applications with Irregular Memory Patterns on Multinode Manycore Architectures. IEEE Transactions on Parallel and Distributed Systems, 2017, 28, 1635-1648. | 5.6 | 4         |
| 25 | Scalable static and dynamic community detection using Grappolo. , 2017, , .                                                                                                                           |     | 19        |
| 26 | A dynamically scheduled architecture for the synthesis of graph methods. , 2016, , .                                                                                                                  |     | 0         |
| 27 | Modeling the Impact of Silicon Photonics on Graph Analytics. , 2016, , .                                                                                                                              |     | 4         |
| 28 | Enabling the high level synthesis of data analytics accelerators. , 2016, , .                                                                                                                         |     | 4         |
| 29 | Efficient synthesis of graph methods. , 2016, , .                                                                                                                                                     |     | 6         |
| 30 | Assessing Advanced Technology in CENATE. , 2016, , .                                                                                                                                                  |     | 2         |
| 31 | Exploring Data Vortex Network Architectures. , 2016, , .                                                                                                                                              |     | 3         |
| 32 | A Dynamically Scheduled Architecture for the Synthesis of Graph Database Queries. , 2016, , .                                                                                                         |     | 0         |
| 33 | In-Memory Graph Databases for Web-Scale Data. Computer, 2015, 48, 24-35.                                                                                                                              | 1.1 | 16        |
| 34 | Optimizing Approximate Weighted Matching on Nvidia Kepler K40. , 2015, , .                                                                                                                            |     | 5         |
| 35 | High-Performance, Distributed Dictionary Encoding of RDF Datasets. , 2015, , .                                                                                                                        |     | 0         |
| 36 | High level synthesis of RDF queries for graph analytics. , 2015, , .                                                                                                                                  |     | 5         |
| 37 | Inter-procedural resource sharing in High Level Synthesis through function proxies. , 2015, , .                               |     | 7         |
| 38 | Function Proxies for Improved Resource Sharing in High Level Synthesis. , 2015, , .                                           |     | 2         |
| 39 | Optimizing irregular applications for energy and performance on the Tilera many-core architecture. , 2015, , .                |     | 3         |
| 40 | Power and performance trade-offs for Space Time Adaptive Processing. , 2015, , .                                              |     | 2         |
| 41 | Irregular Applications: From Architectures to Algorithms [Guest editors' introduction]. Computer, 2015, 48, 14-16.            | 1.1 | 15        |
| 42 | Scaling RDF Triple Stores in Size and Performance. Handbook of Statistics, 2015, 33, 339-362.                                 | 0.6 | 1         |
| 43 | Scaling Irregular Applications through Data Aggregation and Software Multithreading. , 2014, , .                              |     | 19        |
| 44 | Toward a data scalable solution for facilitating discovery of science resources. Parallel Computing, 2014, 40, 682-696.       | 2.1 | 3         |
| 45 | Scaling Semantic Graph Databases in Size and Performance. IEEE Micro, 2014, 34, 16-26.                                        | 1.8 | 13        |
| 46 | An adaptive Memory Interface Controller for improving bandwidth utilization of hybrid and reconfigurable systems. , 2014, , . |     | 2         |
| 47 | High-level synthesis of memory bound and irregular parallel applications with Bambu. , 2014, , .                              |     | 6         |
| 48 | A Flexible CUDA LU-Based Solver for Small, Batched Linear Systems. , 2014, , 87-101.                                          |     | 1         |
| 49 | Ant Colony Optimization for mapping, scheduling and placing in reconfigurable systems. , 2013, , .                            |     | 10        |
| 50 | Composing Data Parallel Code for a SPARQL Graph Engine. , 2013, , .                                                           |     | 1         |
| 51 | Exploring hardware support for scaling irregular applications on multi-node multi-core architectures. , 2013, , .             |     | 1         |
| 52 | Accelerating semantic graph databases on commodity clusters. , 2013, , .                                                      |     | 2         |
| 53 | YAPPA: A compiler-based parallelization framework for irregular applications on MPSoCs. , 2013, , .                           |     | 0         |
| 54 | Accelerating subsurface transport simulation on heterogeneous clusters. , 2013, , .                                           |     | 5         |
| 55 | Toward a data scalable solution for facilitating discovery of scientific data resources. , 2013, , .                                                                     |     | 2         |
| 56 | Exploring Manycore Multinode Systems for Irregular Applications with FPGA Prototyping. , 2013, , .                                                                       |     | 1         |
| 57 | Exploring manycore multinode systems for irregular applications with FPGA prototyping. , 2013, , .                                                                       |     | 0         |
| 58 | Prototyping hardware support for irregular applications. , 2013, , .                                                                                                     |     | 0         |
| 59 | Power/Performance Trade-Offs of Small Batched LU Based Solvers on GPUs. Lecture Notes in Computer Science, 2013, , 813-825.                                              | 1.3 | 16        |
| 60 | Second Workshop on Irregular Applications: Architectures & Algorithms - IA^3 2012. , 2012, , .                                                                           |     | 0         |
| 61 | Efficient Sorting on the Tilera Manycore Architecture. , 2012, , .                                                                                                       |     | 5         |
| 62 | Approximate weighted matching on emerging manycore and multithreaded architectures. International Journal of High Performance Computing Applications, 2012, 26, 413-430. | 3.7 | 25        |
| 63 | A Bandwidth-Optimized Multi-core Architecture for Irregular Applications. , 2012, , .                                                                                    |     | 5         |
| 64 | A High Performance Computing Network and System Simulator for the Power Grid: NGNS^2. , 2012, , .                                                                        |     | 0         |
| 65 | Aho-Corasick String Matching on Shared and Distributed-Memory Parallel Architectures. IEEE Transactions on Parallel and Distributed Systems, 2012, 23, 436-443.          | 5.6 | 32        |
| 66 | Fast and Accurate Simulation of the Cray XMT Multithreaded Supercomputer. IEEE Transactions on Parallel and Distributed Systems, 2012, 23, 2266-2279.                    | 5.6 | 7         |
| 67 | Designing Next-Generation Massively Multithreaded Architectures for Irregular Applications. Computer, 2012, 45, 53-61.                                                   | 1.1 | 5         |
| 68 | Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT. , 2011, , .                                                                      |     | 3         |
| 69 | Irregular applications. , 2011, , .                                                                                                                                      |     | 1         |
| 70 | Towards efficient execution of irregular applications. , 2011, , .                                                                                                       |     | 0         |
| 71 | Experiences with String Matching on the Fermi Architecture. Lecture Notes in Computer Science, 2011, , 26-37.                                                            | 1.3 | 11        |
| 72 | Efficient sparse matrix-matrix multiplication on heterogeneous high performance systems. , 2010, , .                                                                     |     | 9         |
| 73 | Accelerating DNA analysis applications on GPU clusters. , 2010, , .                                                                                                                                           |     | 29        |
| 74 | Ant Colony Heuristic for Mapping and Scheduling Tasks and Communications on Heterogeneous Embedded Systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2010, 29, 911-924. | 2.7 | 132       |
| 75 | Efficient pattern matching on GPUs for intrusion detection systems. , 2010, , .                                                                                                                               |     | 32        |
| 76 | A reconfigurable multiprocessor architecture for a reliable face recognition implementation. , 2010, , .                                                                                                      |     | 5         |
| 77 | Mapping and scheduling of parallel C applications with Ant Colony Optimization onto heterogeneous reconfigurable MPSoCs. , 2010, , .                                                                          |     | 16        |
| 78 | A Compact Transactional Memory Multiprocessor System on FPGA. , 2010, , .                                                                                                                                     |     | 1         |
| 79 | Speeding-Up Expensive Evaluations in High-Level Synthesis Using Solution Modeling and Fitness Inheritance. Adaptation, Learning, and Optimization, 2010, , 701-723.                                           | 0.6 | 14        |
| 80 | Performance modeling of parallel applications on MPSoCs. , 2009, , .                                                                                                                                          |     | 5         |
| 81 | A multiprocessor self-reconfigurable JPEG2000 encoder. , 2009, , .                                                                                                                                            |     | 5         |
| 82 | Evolutionary algorithms for the mapping of pipelined applications onto heterogeneous embedded systems. , 2009, , .                                                                                            |     | 6         |
| 83 | HW/SW methodologies for synchronization in FPGA multiprocessors. , 2009, , .                                                                                                                                  |     | 12        |
| 84 | Prototyping pipelined applications on a heterogeneous FPGA multiprocessor virtual platform. , 2009, , .                                                                                                       |     | 12        |
| 85 | Performance estimation for task graphs combining sequential path profiling and control dependence regions. , 2009, , .                                                                                        |     | 5         |
| 86 | Mapping pipelined applications onto heterogeneous embedded systems. , 2009, , .                                                                                                                               |     | 5         |
| 87 | Improving evolutionary exploration to area-time optimization of FPGA designs. Journal of Systems Architecture, 2008, 54, 1046-1057.                                                                           | 4.3 | 21        |
| 88 | Lightweight DMA management mechanisms for multiprocessors on FPGA. , 2008, , .                                                                                                                                |     | 7         |
| 89 | Ant colony optimization for mapping and scheduling in heterogeneous multiprocessor systems. , 2008, , .                                                                                                       |     | 18        |
| 90 | A dual-priority real-time multiprocessor system on FPGA for automotive applications. , 2008, , .                                                                                                              |     | 14        |
| 91  | A Dual-Priority Real-Time Multiprocessor System on FPGA for Automotive Applications. , 2008, , .                |    | 5         |
| 92  | A design kit for a fully working shared memory multiprocessor on FPGA. , 2007, , .                              |    | 15        |
| 93  | Fitness inheritance in evolutionary and multi-objective high-level synthesis. , 2007, , .                       |    | 7         |
| 94  | Automatic Parallelization of Sequential Specifications for Symmetric MPSoCs. , 2007, , 179-192.                 |    | 7         |
| 95  | A Pipelined Fast 2D-DCT Accelerator for FPGA-based SoCs. , 2007, , .                                            |    | 31        |
| 96  | An Interrupt Controller for FPGA-based Multiprocessors. , 2007, , .                                             |    | 15        |
| 97  | A Self-Reconfigurable Implementation of the JPEG Encoder. , 2007, , .                                           |    | 5         |
| 98  | An Evolutionary Approach to Area-Time Optimization of FPGA designs. , 2007, , .                                 |    | 13        |
| 99  | An Internal Partial Dynamic Reconfiguration Implementation of the JPEG Encoder for Low-Cost FPGAs. , 2007, , .  |    | 7         |
| 100 | Hardware DWT accelerator for MultiProcessor System-on-Chip on FPGA. , 2006, , .                                 |    | 8         |
| 101 | Hardware Architectures for Data-Intensive Computing Problems: A Case Study for String Matching. , 0, , 24-47.   |    | 0         |