David García-Soriano

Google Scholar, DBLP, ORCID, LinkedIn

Welcome! I am an assistant professor and Serra Húnter fellow at the Technical University of Catalonia. I received my undergraduate degrees in Computer Science (2007) and Mathematics (2009) from the Complutense University of Madrid, and my PhD from CWI / University of Amsterdam (2012) under the supervision of Prof. Dr. Harry Buhrman. Afterwards I became a postdoctoral researcher at Yahoo Labs Barcelona, a lecturer at Pompeu Fabra University and the Autonomous University of Barcelona, and a Senior Researcher at Barcelona Media. From 2018 to 2022 I was a Senior Research Scientist in the Algorithmic Data Analysis group at the Institute for Scientific Interchange (ISI) in Turin, while also leading a scientific collaboration with Intesa Sanpaolo on the use of machine learning for hedging and optimizing financial portfolios. Additionally, I have worked as a software engineer at CERN, Google, and Tuenti.

My research focuses on designing efficient algorithms, often running in sublinear time, that analyze large datasets with theoretical guarantees. Topics include large-scale optimization, algorithmic fairness, graph mining, property testing, clustering, and dense subgraph discovery. See my list of publications.


Publications

Disclaimer: All publications listed here are subject to their respective publishers' copyrights. The PDF links are provided for personal use and academic purposes only.

Books

  1. Correlation clustering. Francesco Bonchi, David García-Soriano, and Francesco Gullo. Synthesis Lectures on Data Mining and Knowledge Discovery. Springer, 2022. ISBN: 978-3-031-79198-7.
  2. Query-efficient computation in property testing and learning theory. David García-Soriano. Ipskamp Drukkers, 2012. ISBN: 978-94-6191-233-6. PhD Thesis.

Research Articles

  1. Dense and well-connected subgraph detection in dual networks. Tianyi Chen, Francesco Bonchi, David García-Soriano, Atsushi Miyauchi and Charalampos E. Tsourakakis. In Proceedings of the 22nd SIAM International Conference on Data Mining (SDM), pages 361–369, 2022. [Code]
  2. Finding densest k-connected subgraphs. Francesco Bonchi, David García-Soriano, Atsushi Miyauchi, and Charalampos E. Tsourakakis. Discrete Applied Mathematics, 305:34–47, 2021.
  3. Maxmin-fair ranking: individual fairness under group-fairness constraints. David García-Soriano and Francesco Bonchi. In Proceedings of the 27th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 436–446, 2021. [Code] [Slides]
    Note: The sum ∑u ∈ U wu = 1 in the dual LP (10) should read ∑uK wu = 1. Also example 3 contains a typo: the last two elements in the second ranking should be swapped. That is, r2 = <u2, u1, u3, u6, u4, u8, u5, u7>.
  4. Fair-by-design matching. David García-Soriano and Francesco Bonchi. Data Mining and Knowledge Discovery, 34(5):1291–1335, 2020. Journal Track of ECML PKDD 2020. [Code] [Slides] [Video]
  5. Query-efficient correlation clustering. David García-Soriano, Konstantin Kutzkov, Francesco Bonchi, and Charalampos E. Tsourakakis. In Proceedings of the 29th International World Wide Web Conference (WWW), pages 1468–1478, 2020. [Slides]
    Note: The last line in the "while" loop of Algorithms QwickCluster and QECC should read R ← R ∖ C.
  6. Secure centrality computation over multiple networks. Gilad Asharov, Francesco Bonchi, David García-Soriano, and Tamir Tassa. In Proceedings of the 26th International World Wide Web Conference (WWW), pages 957–966, 2017.
  7. Graph summarization with quality guarantees. Matteo Riondato, David García-Soriano, and Francesco Bonchi. Data Mining and Knowledge Discovery, 31(2):314–349, 2017. [Code] [Slides]
  8. To be connected, or not to be connected: that is the minimum inefficiency subgraph problem. Natali Ruchansky, Francesco Bonchi, David García-Soriano, Francesco Gullo, and Nicolas Kourtellis. In Proceedings of the 26th ACM International Conference on Information and Knowledge Management (CIKM), pages 879–888, 2017. [Slides]
  9. Spheres of influence for more effective viral marketing. Yasir Mehmood, Francesco Bonchi and David García-Soriano. In Proceedings of the 2016 ACM SIGMOD International Conference on Management of Data (SIGMOD), pages 711–726, 2016. [Slides]
  10. Validation of matching. Ya Le, Eric Bax, Nicola Barbieri, David García-Soriano, Jitesh Mehta, and James Li. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), pages 4457–4464, 2016.
  11. The MULTISENSOR project. Dimitris Liparas et al. In Proceedings of the 1st International Workshop on Multimodal Media Data Analytics co-located with the 22nd European Conference on Artificial Intelligence (MMDA@ECAI), pages 1–7, 2016.
  12. The minimum Wiener connector problem. Natali Ruchansky, Francesco Bonchi, David García-Soriano, Francesco Gullo, and Nicolas Kourtellis. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD), pages 1587–1602, 2015. [Slides]
  13. The power of both choices: practical load balancing for distributed stream processing engines. Anis Nasir, Gianmarco De Francisci Morales, David García-Soriano, Nicolas Kourtellis, and Marco Serafini. In Proceedings of the 31st IEEE International Conference on Data Engineering (ICDE), pages 137–148, 2015. Implemented in [Apache Storm]. [Code] [Slides]
  14. Correlation clustering: from theory to practice. Francesco Bonchi, David García-Soriano, and Edo Liberty. Tutorial at the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2014. [Slides]
  15. Graph summarization with quality guarantees. Matteo Riondato, David García-Soriano, and Francesco Bonchi. In Proceedings of the 14th IEEE International Conference on Data Mining (ICDM), pages 947–952, 2014. Updated journal version available above: see [7].
  16. Triangle counting in streamed graphs via small vertex covers. David García-Soriano and Konstantin Kutzkov. In Proceedings of the 14th SIAM International Conference on Data Mining (SDM), pages 352–360, 2014. [Slides]
  17. Nearly tight bounds for testing function isomorphism. Noga Alon, Eric Blais, Sourav Chakraborty, David García-Soriano, and Arie Matsliah. SIAM Journal on Computing, 42(2):459–493, 2013.
  18. The non-adaptive query complexity of testing k-parities. Harry Buhrman, David García-Soriano, Arie Matsliah, and Ronald de Wolf. Chicago Journal of Theoretical Computer Science, 2013(6), 2013.
  19. Junto-symmetric functions, hypergraph isomorphism, and crunching. Sourav Chakraborty, Eldar Fischer, David García-Soriano, and Arie Matsliah. In Proceedings of the 27th IEEE Conference on Computational Complexity (CCC), pages 148–158, 2012.
  20. Monotonicity testing and shortest-path routing on the cube. Jop Briët, Sourav Chakraborty, David García-Soriano, and Arie Matsliah. Combinatorica, 32, pages 1–19, 2012. [Slides]
  21. Nearly tight bounds for testing function isomorphism. Sourav Chakraborty, David García-Soriano, and Arie Matsliah. In Proceedings of the 22nd ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1683–1702, 2011. [Slides]. Updated journal version available above: see [17].
  22. Cycle detection, order finding and discrete log with jumps. Sourav Chakraborty, David García-Soriano, and Arie Matsliah. In Proceedings of the 2nd Symposium on Innovations in Theoretical Computer Science (ITCS), pages 284–297, 2011. [Slides]
  23. Efficient sample extractors for juntas with applications. Sourav Chakraborty, David García-Soriano, and Arie Matsliah. In Proceedings of the 38th International Colloquium on Automata, Languages and Programming (ICALP), pages 545–556, 2011. [Slides]
  24. Monotonicity testing and shortest-path routing on the cube. Jop Briët, Sourav Chakraborty, David García-Soriano, and Arie Matsliah. In Proceedings of the 14th International Workshop on Randomization and Computation (RANDOM), pages 462–475. Springer-Verlag, 2010. Updated journal version available above: see [20].
  25. Learning parities in the mistake-bound model. Harry Buhrman, David García-Soriano, and Arie Matsliah. Information Processing Letters, 111(1):16–21, 2010. [Slides]

Patents

  1. Metodo per definire un portafoglio di hedging di un portafoglio di asset finanziari. Paola Mosconi, Francesco Parino, David García-Soriano, Arianna Miola, Alessandro Marra, Simone Zola. Intesa Sanpaolo S.p.A. and Intesa Sanpaolo Innovation Center S.p.A. Reference: IT-102021000031970 (2024).
  2. Metodo per definire un portafoglio di hedging di un portafoglio di asset finanziari. David García-Soriano, Paola Mosconi, Simone Zola, Alessandro Marra, Laura Li Puma and Silvia Ronchiadin. Intesa Sanpaolo S.p.A. and Intesa Sanpaolo Innovation Center S.p.A. Reference: IT-102020000031112 (2022).
  3. Information matching and match validation. Eric Bax, Nicola Barbieri, David García-Soriano, and Jitesh Mehta. US-20150347591-A1, Yahoo! Inc. Now US-9947060-B2, EXCALIBUR IP, 2015.

Other

Uso de hardware gráfico para la aceleración de métodos algebraicos de reconstrucción. David García-Soriano, Enrique Martín-Martín, David Romero Laorden. [Slides]

Teaching


Awards


Miscellaneous

My Math Genealogy
My Erdős Number

Contact

CS - Departament de Ciències de la Computació
Edifici Omega - 217, Campus Nord [map]
Universitat Politècnica de Catalunya
C/Jordi Girona, 1-3
08034 Barcelona, Spain

E-mail: david.garcia.soriano(at)upc.edu
Phone: +34 93 413 78 63